2 changes: 0 additions & 2 deletions docs/benchmark.md
@@ -1,5 +1,3 @@
[简体中文](zh/benchmark.md)

# Benchmark

FastDeploy extends the [vLLM benchmark](https://github.com/vllm-project/vllm/blob/main/benchmarks/) script with additional metrics, enabling more detailed performance benchmarking for FastDeploy.
3 changes: 1 addition & 2 deletions docs/best_practices/ERNIE-4.5-0.3B-Paddle.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/ERNIE-4.5-0.3B-Paddle.md)

# ERNIE-4.5-0.3B
## Environmental Preparation
### 1.1 Hardware requirements
@@ -90,4 +88,5 @@ export FD_SAMPLING_CLASS=rejection
```

## FAQ

If you encounter any problems during use, you can refer to [FAQ](./FAQ.md).
2 changes: 0 additions & 2 deletions docs/best_practices/ERNIE-4.5-21B-A3B-Paddle.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/ERNIE-4.5-21B-A3B-Paddle.md)

# ERNIE-4.5-21B-A3B
## Environmental Preparation
### 1.1 Hardware requirements
2 changes: 0 additions & 2 deletions docs/best_practices/ERNIE-4.5-21B-A3B-Thinking.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/ERNIE-4.5-21B-A3B-Thinking.md)

# ERNIE-4.5-21B-A3B
## Environmental Preparation
### 1.1 Hardware requirements
2 changes: 0 additions & 2 deletions docs/best_practices/ERNIE-4.5-300B-A47B-Paddle.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/ERNIE-4.5-300B-A47B-Paddle.md)

# ERNIE-4.5-300B-A47B
## Environmental Preparation
### 1.1 Hardware requirements
2 changes: 0 additions & 2 deletions docs/best_practices/ERNIE-4.5-VL-28B-A3B-Paddle.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/ERNIE-4.5-VL-28B-A3B-Paddle.md)

# ERNIE-4.5-VL-28B-A3B-Paddle

## 1. Environment Preparation
2 changes: 0 additions & 2 deletions docs/best_practices/ERNIE-4.5-VL-424B-A47B-Paddle.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/ERNIE-4.5-VL-424B-A47B-Paddle.md)

# ERNIE-4.5-VL-424B-A47B-Paddle

## 1. Environment Preparation
2 changes: 0 additions & 2 deletions docs/best_practices/FAQ.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/FAQ.md)

# FAQ
## 1.CUDA out of memory
1. when starting the service:
2 changes: 0 additions & 2 deletions docs/best_practices/PaddleOCR-VL-0.9B.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/PaddleOCR-VL-0.9B.md)

# PaddleOCR-VL-0.9B

## 1. Environment Preparation
2 changes: 0 additions & 2 deletions docs/best_practices/README.md
@@ -1,5 +1,3 @@
[简体中文](../zh/best_practices/README.md)

# Optimal Deployment

- [ERNIE-4.5-0.3B-Paddle.md](ERNIE-4.5-0.3B-Paddle.md)
1 change: 0 additions & 1 deletion docs/cli/bench.md
@@ -1,5 +1,4 @@
# bench: Benchmark Testing

## 1. bench latency: Offline Latency Test

### Parameters
2 changes: 0 additions & 2 deletions docs/features/chunked_prefill.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/chunked_prefill.md)

# Chunked Prefill

Chunked Prefill employs a segmentation strategy that breaks down Prefill requests into smaller subtasks, which are then batched together with Decode requests. This approach better balances compute-intensive (Prefill) and memory-intensive (Decode) operations, optimizes GPU resource utilization, reduces computational overhead and memory footprint per Prefill, thereby lowering peak memory usage and avoiding out-of-memory issues.
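
To make the segmentation strategy concrete, here is a minimal sketch of splitting one Prefill request into chunks and pairing each chunk with the Decode requests batched alongside it. It is an illustration only, not FastDeploy's scheduler: `split_prefill`, `build_steps`, `chunk_size`, and the request IDs are all hypothetical names.

```python
from typing import Dict, List


def split_prefill(prompt_tokens: List[int], chunk_size: int) -> List[List[int]]:
    """Split one prefill request into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        prompt_tokens[i:i + chunk_size]
        for i in range(0, len(prompt_tokens), chunk_size)
    ]


def build_steps(
    prompt_tokens: List[int], chunk_size: int, decode_request_ids: List[str]
) -> List[Dict[str, list]]:
    """Pair every prefill chunk with the decode requests that share its batch."""
    return [
        {"prefill_chunk": chunk, "decode_requests": list(decode_request_ids)}
        for chunk in split_prefill(prompt_tokens, chunk_size)
    ]


# Hypothetical 10-token prompt, chunked 4 tokens at a time.
steps = build_steps(list(range(10)), chunk_size=4, decode_request_ids=["req-a", "req-b"])
chunk_lengths = [len(step["prefill_chunk"]) for step in steps]
print(chunk_lengths)
```

Each step carries a bounded amount of Prefill work, which is why the peak memory per step drops compared with processing the whole prompt at once.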
2 changes: 0 additions & 2 deletions docs/features/data_parallel_service.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/data_parallel_service.md)

# Data Parallelism
For MoE models, Expert Parallelism (EP) can be enabled together with Data Parallelism (DP): EP distributes expert workloads, and DP enables parallel request processing.

2 changes: 0 additions & 2 deletions docs/features/disaggregated.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/disaggregated.md)

# Disaggregated Deployment

Large model inference consists of two phases: Prefill and Decode, which are compute-intensive and memory access-intensive respectively. Deploying Prefill and Decode separately in certain scenarios can improve hardware utilization, effectively increase throughput, and reduce overall sentence latency.
2 changes: 0 additions & 2 deletions docs/features/early_stop.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/early_stop.md)

# Early Stopping

Early stopping terminates the model's token generation ahead of time. It applies different strategies to decide whether the token sequence generated so far meets the early stopping criteria; if it does, generation is terminated. FastDeploy currently supports the repetition strategy and stop sequences.
2 changes: 0 additions & 2 deletions docs/features/graph_optimization.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/graph_optimization.md)

# Graph optimization technology in FastDeploy

FastDeploy's `GraphOptimizationBackend` integrates a variety of graph optimization technologies:
2 changes: 0 additions & 2 deletions docs/features/load_balance.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/load_balance.md)

# Global Scheduler: Multi-Instance Load Balancing

## Design Overview
2 changes: 0 additions & 2 deletions docs/features/multi-node_deployment.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/multi-node_deployment.md)

# Multi-Node Deployment

## Overview
2 changes: 0 additions & 2 deletions docs/features/plas_attention.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/plas_attention.md)

# PLAS

## Introduction
2 changes: 0 additions & 2 deletions docs/features/plugins.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/plugins.md)

# FastDeploy Plugin Mechanism Documentation

FastDeploy supports a plugin mechanism that allows users to extend functionality without modifying the core code. Plugins are automatically discovered and loaded through Python's `entry_points` mechanism.
2 changes: 0 additions & 2 deletions docs/features/prefix_caching.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/prefix_caching.md)

# Prefix Caching

Prefix Caching is a technique to optimize the inference efficiency of generative models. Its core idea is to cache intermediate computation results (KV Cache) of input sequences, avoiding redundant computations and thereby accelerating response times for multiple requests sharing the same prefix.
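
The idea of reusing cached results for a shared prefix can be sketched as a lookup keyed by token prefixes. This is a toy model under stated assumptions, not FastDeploy's cache: the `PrefixCache` class, the block size `BLOCK`, and the string placeholders standing in for KV tensors are all hypothetical.

```python
from typing import Dict, List, Tuple

BLOCK = 4  # hypothetical block size, in tokens


class PrefixCache:
    """Toy cache that maps block-aligned token prefixes to placeholder KV entries."""

    def __init__(self) -> None:
        self._blocks: Dict[Tuple[int, ...], str] = {}

    def match(self, tokens: List[int]) -> int:
        """Return how many leading tokens already have cached KV entries."""
        matched = 0
        for end in range(BLOCK, len(tokens) + 1, BLOCK):
            if tuple(tokens[:end]) not in self._blocks:
                break
            matched = end
        return matched

    def insert(self, tokens: List[int]) -> None:
        """Record every block-aligned prefix of a processed request."""
        for end in range(BLOCK, len(tokens) + 1, BLOCK):
            self._blocks.setdefault(tuple(tokens[:end]), f"kv[0:{end}]")


cache = PrefixCache()
system_prompt = list(range(8))  # shared prefix, e.g. a system prompt
cache.insert(system_prompt + [100, 101, 102, 103])

# A second request with the same prefix only needs to compute its new suffix.
reused = cache.match(system_prompt + [200, 201, 202, 203])
print(reused)
```

Only the tokens after the matched prefix need a fresh Prefill pass, which is where the response-time saving for requests sharing a prefix comes from.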
2 changes: 0 additions & 2 deletions docs/features/reasoning_output.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/reasoning_output.md)

# Reasoning Outputs

Reasoning models return an additional `reasoning_content` field in their output, which contains the reasoning steps that led to the final conclusion.
2 changes: 0 additions & 2 deletions docs/features/sampling.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/sampling.md)

# Sampling Strategies

Sampling strategies are used to determine how to select the next token from the output probability distribution of a model. FastDeploy currently supports multiple sampling strategies including Top-p, Top-k_Top-p, and Min-p Sampling.
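
As a rough illustration of two of the named strategies, the sketch below filters a small next-token distribution with Top-p and with Min-p, then renormalizes the survivors. It uses the textbook definitions of both filters and may differ from FastDeploy's kernels; the function names and the example distribution are hypothetical, and Top-k_Top-p is not covered.

```python
import random
from typing import Dict


def _renormalize(kept: Dict[str, float]) -> Dict[str, float]:
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most probable tokens until their cumulative mass reaches top_p."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return _renormalize(kept)


def min_p_filter(probs: Dict[str, float], min_p: float) -> Dict[str, float]:
    """Keep tokens whose probability is at least min_p times the top probability."""
    threshold = min_p * max(probs.values())
    return _renormalize({t: p for t, p in probs.items() if p >= threshold})


# Hypothetical next-token distribution.
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
after_top_p = top_p_filter(probs, top_p=0.7)
after_min_p = min_p_filter(probs, min_p=0.2)

# Draw the next token from the filtered, renormalized distribution.
next_token = random.choices(list(after_top_p), weights=list(after_top_p.values()))[0]
print(sorted(after_top_p), sorted(after_min_p), next_token)
```

Top-p cuts by cumulative mass, so the number of surviving tokens depends on how peaked the distribution is; Min-p cuts relative to the best token, so it tightens automatically when the model is confident.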
2 changes: 0 additions & 2 deletions docs/features/speculative_decoding.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/speculative_decoding.md)

# 🔮 Speculative Decoding

This project implements an efficient **Speculative Decoding** inference framework based on PaddlePaddle. It supports **Multi-Token Proposing (MTP)** to accelerate large language model (LLM) generation, significantly reducing latency and improving throughput.
2 changes: 0 additions & 2 deletions docs/features/structured_outputs.md
@@ -1,5 +1,3 @@
[简体中文](../zh/features/structured_outputs.md)

# Structured Outputs

## Overview
2 changes: 0 additions & 2 deletions docs/get_started/README.md
@@ -1,5 +1,3 @@
[简体中文](../zh/get_started/README.md)

# Get Started

- [Deploy ERNIE-4.5-0.3B-Paddle in 10 Minutes](quick_start.md)
2 changes: 0 additions & 2 deletions docs/get_started/ernie-4.5-vl.md
@@ -1,5 +1,3 @@
[简体中文](../zh/get_started/ernie-4.5-vl.md)

# Deploy ERNIE-4.5-VL-424B-A47B Multimodal Model

This document explains how to deploy the ERNIE-4.5-VL multimodal model, which lets users interact with the model using multimodal data (including reasoning capabilities). Before starting the deployment, please ensure that your hardware environment meets the following requirements:
2 changes: 0 additions & 2 deletions docs/get_started/ernie-4.5.md
@@ -1,5 +1,3 @@
[简体中文](../zh/get_started/ernie-4.5.md)

# Deploy ERNIE-4.5-300B-A47B Model

This document explains how to deploy the ERNIE-4.5 model. Before starting the deployment, please ensure that your hardware environment meets the following requirements:
2 changes: 0 additions & 2 deletions docs/get_started/installation/Enflame_gcu.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/Enflame_gcu.md)

# Running ERNIE 4.5 Series Models with FastDeploy

The Enflame S60 ([Learn about Enflame](https://www.enflame-tech.com/)) is a next-generation AI inference accelerator card designed for large-scale deployment in data centers. It meets the demands of large language models (LLMs), search/advertising/recommendation systems, and traditional models. Characterized by broad model coverage, user-friendliness, and high portability, it is widely applicable to mainstream inference scenarios such as image and text generation applications, search and recommendation systems, and text/image/speech recognition.
2 changes: 0 additions & 2 deletions docs/get_started/installation/README.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/README.md)

# FastDeploy Installation

FastDeploy currently supports installation on the following hardware platforms:
2 changes: 0 additions & 2 deletions docs/get_started/installation/hygon_dcu.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/hygon_dcu.md)

# Run ERNIE-4.5-300B-A47B & ERNIE-4.5-21B-A3B model on hygon machine
The current version of the software serves only as a demonstration of the hygon k100AI combined with the FastDeploy inference framework for large models. There may be issues when running the latest ERNIE4.5 model; fixes and performance optimizations will follow, and subsequent versions will be more stable.

2 changes: 0 additions & 2 deletions docs/get_started/installation/iluvatar_gpu.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/iluvatar_gpu.md)

# Run ERNIE-4.5-300B-A47B & ERNIE-4.5-21B-A3B model on iluvatar machine

## Machine Preparation
2 changes: 0 additions & 2 deletions docs/get_started/installation/intel_gaudi.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/intel_gaudi.md)

# Intel Gaudi Installation for running ERNIE 4.5 Series Models

The following installation methods are available when your environment meets these requirements:
2 changes: 0 additions & 2 deletions docs/get_started/installation/kunlunxin_xpu.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/kunlunxin_xpu.md)

# Kunlunxin XPU

## Requirements
2 changes: 0 additions & 2 deletions docs/get_started/installation/metax_gpu.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/metax_gpu.md)

# Metax GPU Installation for running ERNIE 4.5 Series Models

The following installation methods are available when your environment meets these requirements:
2 changes: 0 additions & 2 deletions docs/get_started/installation/nvidia_gpu.md
@@ -1,5 +1,3 @@
[简体中文](../../zh/get_started/installation/nvidia_gpu.md)

# NVIDIA CUDA GPU Installation

The following installation methods are available when your environment meets these requirements:
2 changes: 0 additions & 2 deletions docs/get_started/quick_start.md
@@ -1,5 +1,3 @@
[简体中文](../zh/get_started/quick_start.md)

# Deploy ERNIE-4.5-0.3B-Paddle in 10 Minutes

Before deployment, ensure your environment meets the following requirements:
2 changes: 0 additions & 2 deletions docs/get_started/quick_start_qwen.md
@@ -1,5 +1,3 @@
[简体中文](../zh/get_started/quick_start_qwen.md)

# Deploy QWEN3-0.6b in 10 Minutes

Before deployment, ensure your environment meets the following requirements:
2 changes: 0 additions & 2 deletions docs/get_started/quick_start_vl.md
@@ -1,5 +1,3 @@
[简体中文](../zh/get_started/quick_start_vl.md)

# Deploy ERNIE-4.5-VL-28B-A3B-Paddle Multimodal Model in 10 Minutes

Before deployment, please ensure your environment meets the following requirements:
2 changes: 0 additions & 2 deletions docs/index.md
@@ -1,5 +1,3 @@
[简体中文](zh/index.md)

# FastDeploy

**FastDeploy** is an inference and deployment toolkit for large language models and visual language models based on PaddlePaddle. It delivers **production-ready, out-of-the-box deployment solutions** with core acceleration technologies:
2 changes: 0 additions & 2 deletions docs/offline_inference.md
@@ -1,5 +1,3 @@
[简体中文](zh/offline_inference.md)

# Offline Inference

## 1. Usage
2 changes: 0 additions & 2 deletions docs/online_serving/README.md
@@ -1,5 +1,3 @@
[简体中文](../zh/online_serving/README.md)

# OpenAI Protocol-Compatible API Server

FastDeploy provides a service-oriented deployment solution that is compatible with the OpenAI protocol. Users can quickly deploy it using the following command:
2 changes: 0 additions & 2 deletions docs/online_serving/graceful_shutdown_service.md
@@ -1,5 +1,3 @@
[简体中文](../zh/online_serving/graceful_shutdown_service.md)

# Graceful Service Node Shutdown Solution

## 1. Core Objective
2 changes: 0 additions & 2 deletions docs/online_serving/metrics.md
@@ -1,5 +1,3 @@
[简体中文](../zh/online_serving/metrics.md)

# Monitoring Metrics

After FastDeploy is launched, it supports continuous monitoring of the FastDeploy service status through Metrics. When starting FastDeploy, you can specify the port for the Metrics service by configuring the `metrics-port` parameter.
2 changes: 0 additions & 2 deletions docs/online_serving/scheduler.md
@@ -1,5 +1,3 @@
[简体中文](../zh/online_serving/scheduler.md)

# Scheduler

FastDeploy currently supports two types of schedulers: **Local Scheduler** and **Global Scheduler**. The Global Scheduler is designed for large-scale clusters, enabling secondary load balancing across nodes based on real-time workload metrics.
2 changes: 0 additions & 2 deletions docs/parameters.md
@@ -1,5 +1,3 @@
[简体中文](zh/parameters.md)

# FastDeploy Parameter Documentation

## Parameter Description
2 changes: 0 additions & 2 deletions docs/quantization/README.md
@@ -1,5 +1,3 @@
[简体中文](../zh/quantization/README.md)

# Quantization

FastDeploy supports various quantization inference precisions including FP8, INT8, INT4, 2-bits, etc. It supports different precision inference for weights, activations, and KVCache tensors, which can meet the inference requirements of different scenarios such as low cost, low latency, and long context.
2 changes: 0 additions & 2 deletions docs/quantization/online_quantization.md
@@ -1,5 +1,3 @@
[简体中文](../zh/quantization/online_quantization.md)

# Online Quantization

Online quantization refers to the inference engine quantizing weights after loading BF16 weights, rather than loading pre-quantized low-precision weights. FastDeploy supports online quantization of BF16 to various precisions, including: INT4, INT8, and FP8.
2 changes: 0 additions & 2 deletions docs/quantization/wint2.md
@@ -1,5 +1,3 @@
[简体中文](../zh/quantization/wint2.md)

# WINT2 Quantization

Weights are compressed offline using the [CCQ (Convolutional Coding Quantization)](https://arxiv.org/pdf/2507.07145) method. The actual stored numerical type of weights is INT8, with 4 weights packed into each INT8 value, equivalent to 2 bits per weight. Activations are not quantized. During inference, weights are dequantized and decoded in real-time to BF16 numerical type, and calculations are performed using BF16 numerical type.
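
The storage arithmetic (four 2-bit weights per 8-bit value) can be illustrated with plain bit packing. This sketch shows only the packing step under an assumed layout where the first code occupies the lowest bits; it does not implement CCQ encoding or decoding, it treats the byte as unsigned for simplicity, and `pack4`/`unpack4` are hypothetical helpers rather than FastDeploy functions.

```python
from typing import List


def pack4(codes: List[int]) -> int:
    """Pack four 2-bit codes (0..3) into one byte, first code in the lowest bits."""
    if len(codes) != 4 or any(not 0 <= c <= 3 for c in codes):
        raise ValueError("expected four codes in the range 0..3")
    byte = 0
    for i, code in enumerate(codes):
        byte |= code << (2 * i)
    return byte


def unpack4(byte: int) -> List[int]:
    """Recover the four 2-bit codes from one packed byte."""
    return [(byte >> (2 * i)) & 0b11 for i in range(4)]


packed = pack4([1, 0, 3, 2])
restored = unpack4(packed)
print(packed, restored)
```

Because each stored byte holds four weights, the packed tensor needs a quarter of the bytes that one-weight-per-INT8 storage would.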
2 changes: 0 additions & 2 deletions docs/supported_models.md
@@ -1,5 +1,3 @@
[简体中文](zh/supported_models.md)

# Supported Models

FastDeploy currently supports the following models, which can be downloaded automatically during FastDeploy deployment. Specify the ``model`` parameter as the model name in the table below to automatically download model weights (all support resumable downloads). The following three download sources are supported:
2 changes: 0 additions & 2 deletions docs/usage/code_overview.md
@@ -1,5 +1,3 @@
[简体中文](../zh/usage/code_overview.md)

# Code Overview

Below is an overview of the FastDeploy code structure and functionality organized by directory.
2 changes: 0 additions & 2 deletions docs/usage/environment_variables.md
@@ -1,5 +1,3 @@
[简体中文](../zh/usage/environment_variables.md)

# FastDeploy Environment Variables

FastDeploy's environment variables are defined in `fastdeploy/envs.py` at the root of the repository. Below is the documentation:
2 changes: 0 additions & 2 deletions docs/usage/fastdeploy_unit_test_guide.md
@@ -1,5 +1,3 @@
[简体中文](../zh/usage/fastdeploy_unit_test_guide.md)

# FastDeploy Unit Test Specification
1. Test Naming Conventions
- Test files must start with test_.
2 changes: 0 additions & 2 deletions docs/usage/kunlunxin_xpu_deployment.md
@@ -1,5 +1,3 @@
[简体中文](../zh/usage/kunlunxin_xpu_deployment.md)

## Supported Models
|Model Name|Context Length|Quantization|XPUs Required|Deployment Commands|Applicable Version|
|-|-|-|-|-|-|
2 changes: 0 additions & 2 deletions docs/usage/log.md
@@ -1,5 +1,3 @@
[简体中文](../zh/usage/log.md)

# Log Description

FastDeploy generates the following log files during deployment. Below is an explanation of each log's purpose.
2 changes: 0 additions & 2 deletions docs/zh/benchmark.md
@@ -1,5 +1,3 @@
[English](../benchmark.md)

# Benchmark

FastDeploy extends the [vLLM benchmark](https://github.com/vllm-project/vllm/blob/main/benchmarks/) script with additional statistics, which can be used to benchmark FastDeploy's performance metrics in more detail.
2 changes: 0 additions & 2 deletions docs/zh/best_practices/ERNIE-4.5-0.3B-Paddle.md
@@ -1,5 +1,3 @@
[English](../../best_practices/ERNIE-4.5-0.3B-Paddle.md)

# ERNIE-4.5-0.3B
## 1. Environment Preparation
### 1.1 Support Status
2 changes: 0 additions & 2 deletions docs/zh/best_practices/ERNIE-4.5-21B-A3B-Paddle.md
@@ -1,5 +1,3 @@
[English](../../best_practices/ERNIE-4.5-21B-A3B-Paddle.md)

# ERNIE-4.5-21B-A3B
## 1. Environment Preparation
### 1.1 Support Status
2 changes: 0 additions & 2 deletions docs/zh/best_practices/ERNIE-4.5-21B-A3B-Thinking.md
@@ -1,5 +1,3 @@
[English](../../best_practices/ERNIE-4.5-21B-A3B-Thinking.md)

# ERNIE-4.5-21B-A3B-Thinking
## 1. Environment Preparation
### 1.1 Support Status
2 changes: 0 additions & 2 deletions docs/zh/best_practices/ERNIE-4.5-300B-A47B-Paddle.md
@@ -1,5 +1,3 @@
[English](../../best_practices/ERNIE-4.5-300B-A47B-Paddle.md)

# ERNIE-4.5-300B-A47B
## 1. Environment Preparation
### 1.1 Support Status
2 changes: 0 additions & 2 deletions docs/zh/best_practices/ERNIE-4.5-VL-28B-A3B-Paddle.md
@@ -1,5 +1,3 @@
[English](../../best_practices/ERNIE-4.5-VL-28B-A3B-Paddle.md)

# ERNIE-4.5-VL-28B-A3B-Paddle

## 1. Environment Preparation
2 changes: 0 additions & 2 deletions docs/zh/best_practices/ERNIE-4.5-VL-424B-A47B-Paddle.md
@@ -1,5 +1,3 @@
[English](../../best_practices/ERNIE-4.5-VL-424B-A47B-Paddle.md)

# ERNIE-4.5-VL-424B-A47B-Paddle

## 1. Environment Preparation
2 changes: 0 additions & 2 deletions docs/zh/best_practices/FAQ.md
@@ -1,5 +1,3 @@
[English](../../best_practices/FAQ.md)

# Frequently Asked Questions (FAQ)
## 1. Insufficient GPU memory
1. Insufficient GPU memory when starting the service:
2 changes: 0 additions & 2 deletions docs/zh/best_practices/PaddleOCR-VL-0.9B.md
@@ -1,5 +1,3 @@
[English](../../best_practices/PaddleOCR-VL-0.9B.md)

# PaddleOCR-VL-0.9B

## 1. Environment Preparation
2 changes: 0 additions & 2 deletions docs/zh/best_practices/README.md
@@ -1,5 +1,3 @@
[English](../../best_practices/README.md)

# Best Practices

- [ERNIE-4.5-0.3B-Paddle.md](ERNIE-4.5-0.3B-Paddle.md)