Initial commit

commit 1b2db41c7e · 2025-03-24 11:19:28 +08:00
1394 changed files with 868478 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1 @@
+*.sh text eol=lf
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,43 @@
+# Generated by Cargo
+# will have compiled files and executables
+debug/
+target/
+__pycache__/
+hudet/
+cv/
+layout_app.py
+api/flask_session
+
+# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
+# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
+Cargo.lock
+
+# These are backup files generated by rustfmt
+**/*.rs.bk
+
+# MSVC Windows builds of rustc generate these, which store debugging information
+*.pdb
+*.trie
+
+.idea/
+.vscode/
+
+# Exclude Mac generated files
+.DS_Store
+
+# Exclude the log folder
+docker/ragflow-logs/
+/flask_session
+/logs
+rag/res/deepdoc
+
+# Exclude sdk generated files
+sdk/python/ragflow.egg-info/
+sdk/python/build/
+sdk/python/dist/
+sdk/python/ragflow_sdk.egg-info/
+huggingface.co/
+nltk_data/
+
+# Exclude hash-like temporary files like 9b5ad71b2ce5302211f9c61530b329a4922fc6a4
+*[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]*
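
A note on the last `.gitignore` pattern: it matches more than full hashes, since any name containing ten consecutive lowercase hex characters qualifies. The sketch below approximates the rule with Python's `fnmatch`, which is an assumption on my part: `fnmatch` is not Git's matcher, although both treat `*` and `[0-9a-f]` the same way for names without `/`. The helper name `looks_ignored` and every sample name except the hash from the comment are hypothetical.

```python
from fnmatch import fnmatchcase

# The pattern from .gitignore: ten hex-digit classes wrapped in wildcards.
PATTERN = "*" + "[0-9a-f]" * 10 + "*"


def looks_ignored(name: str) -> bool:
    """Approximate the .gitignore rule for a single path component."""
    return fnmatchcase(name, PATTERN)


if __name__ == "__main__":
    samples = (
        "9b5ad71b2ce5302211f9c61530b329a4922fc6a4",  # hash from the .gitignore comment
        "report_0123456789.txt",  # hypothetical: ten digits in a row
        "README.md",  # hypothetical: no run of ten hex characters
    )
    for name in samples:
        print(name, looks_ignored(name))
```

The second sample shows the side effect worth knowing about: ordinary files whose names happen to contain a long digit or hex run would be ignored too.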
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,48 @@
+# Contribution guidelines
+
+This document offers guidelines and major considerations for submitting your contributions to RAGFlow.
+
+- To report a bug, file a [GitHub issue](https://github.com/infiniflow/ragflow/issues/new/choose) with us.
+- For further questions, you can explore existing discussions or initiate a new one in [Discussions](https://github.com/orgs/infiniflow/discussions).
+
+## What you can contribute
+
+The list below mentions some contributions you can make, but it is not a complete list.
+
+- Proposing or implementing new features
+- Fixing a bug
+- Adding test cases or demos
+- Posting a blog or tutorial
+- Updating existing documents, code, or annotations
+- Suggesting more user-friendly error codes
+
+## File a pull request (PR)
+
+### General workflow
+
+1. Fork our GitHub repository.
+2. Clone your fork to your local machine:
+`git clone git@github.com:<yourname>/ragflow.git`
+3. Create a local branch:
+`git checkout -b my-branch`
+4. Commit your changes to the local branch with a sufficiently informative commit message:
+`git commit -m 'Provide sufficient info in your commit message'`
+5. Push the branch to GitHub:
+`git push origin my-branch`
+6. Submit a pull request for review.
+
+### Before filing a PR
+
+- Consider splitting a large PR into multiple smaller, standalone PRs to keep a traceable development history.
+- Ensure that your PR addresses just one issue, or keep any unrelated changes small.
+- Add test cases when contributing new features. They demonstrate that your code functions correctly and protect against potential issues from future changes.
+
+### Describing your PR
+
+- Ensure that your PR title is concise and clear, providing all the required information.
+- Refer to a corresponding GitHub issue in your PR description if applicable.
+- Include sufficient design details for *breaking changes* or *API changes* in your description.
+
+### Reviewing & merging a PR
+
+Ensure that your PR passes all Continuous Integration (CI) tests before merging it.
--- a/Dockerfile
+++ b/Dockerfile
@@ -0,0 +1,210 @@
+# base stage
+FROM ubuntu:22.04 AS base
+USER root
+SHELL ["/bin/bash", "-c"]
+
+ARG NEED_MIRROR=0
+ARG LIGHTEN=0
+ENV LIGHTEN=${LIGHTEN}
+
+WORKDIR /ragflow
+
+# Copy models downloaded via download_deps.py
+RUN mkdir -p /ragflow/rag/res/deepdoc /root/.ragflow
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/huggingface.co,target=/huggingface.co \
+    cp /huggingface.co/InfiniFlow/huqie/huqie.txt.trie /ragflow/rag/res/ && \
+    tar --exclude='.*' -cf - \
+        /huggingface.co/InfiniFlow/text_concat_xgb_v1.0 \
+        /huggingface.co/InfiniFlow/deepdoc \
+        | tar -xf - --strip-components=3 -C /ragflow/rag/res/deepdoc 
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/huggingface.co,target=/huggingface.co \
+    if [ "$LIGHTEN" != "1" ]; then \
+        (tar -cf - \
+            /huggingface.co/BAAI/bge-large-zh-v1.5 \
+            /huggingface.co/BAAI/bge-reranker-v2-m3 \
+            /huggingface.co/maidalun1020/bce-embedding-base_v1 \
+            /huggingface.co/maidalun1020/bce-reranker-base_v1 \
+            | tar -xf - --strip-components=2 -C /root/.ragflow) \
+    fi
+
+# https://github.com/chrismattmann/tika-python
+# This is the only way to run python-tika without internet access. Without this set, the default is to check the tika version and pull latest every time from Apache.
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
+    cp -r /deps/nltk_data /root/ && \
+    cp /deps/tika-server-standard-3.0.0.jar /deps/tika-server-standard-3.0.0.jar.md5 /ragflow/ && \
+    cp /deps/cl100k_base.tiktoken /ragflow/9b5ad71b2ce5302211f9c61530b329a4922fc6a4
+
+ENV TIKA_SERVER_JAR="file:///ragflow/tika-server-standard-3.0.0.jar"
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Setup apt
+# Python package and implicit dependencies:
+# opencv-python: libglib2.0-0 libglx-mesa0 libgl1
+# aspose-slides: pkg-config libicu-dev libgdiplus         libssl1.1_1.1.1f-1ubuntu2_amd64.deb
+# python-pptx:   default-jdk                              tika-server-standard-3.0.0.jar
+# selenium:      libatk-bridge2.0-0                       chrome-linux64-121-0-6167-85
+# Building C extensions: libpython3-dev libgtk-4-1 libnss3 xdg-utils libgbm-dev
+RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+    if [ "$NEED_MIRROR" == "1" ]; then \
+        sed -i 's|http://archive.ubuntu.com|https://mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list; \
+    fi; \
+    rm -f /etc/apt/apt.conf.d/docker-clean && \
+    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache && \
+    chmod 1777 /tmp && \
+    apt update && \
+    apt --no-install-recommends install -y ca-certificates && \
+    apt update && \
+    apt install -y libglib2.0-0 libglx-mesa0 libgl1 && \
+    apt install -y pkg-config libicu-dev libgdiplus && \
+    apt install -y default-jdk && \
+    apt install -y libatk-bridge2.0-0 && \
+    apt install -y libpython3-dev libgtk-4-1 libnss3 xdg-utils libgbm-dev && \
+    apt install -y python3-pip pipx nginx unzip curl wget git vim less
+
+RUN if [ "$NEED_MIRROR" == "1" ]; then \
+        pip3 config set global.index-url https://mirrors.aliyun.com/pypi/simple && \
+        pip3 config set global.trusted-host mirrors.aliyun.com; \
+        mkdir -p /etc/uv && \
+        echo "[[index]]" > /etc/uv/uv.toml && \
+        echo 'url = "https://mirrors.aliyun.com/pypi/simple"' >> /etc/uv/uv.toml && \
+        echo "default = true" >> /etc/uv/uv.toml; \
+    fi; \
+    pipx install uv
+
+ENV PYTHONDONTWRITEBYTECODE=1 DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1
+ENV PATH=/root/.local/bin:$PATH
+
+# nodejs 12.22 on Ubuntu 22.04 is too old
+RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+    curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && \
+    apt purge -y nodejs npm cargo && \
+    apt autoremove -y && \
+    apt update && \
+    apt install -y nodejs
+
+# A modern version of cargo is needed for the latest version of the Rust compiler.
+RUN apt update && apt install -y curl build-essential \
+    && if [ "$NEED_MIRROR" == "1" ]; then \
+         # Use TUNA mirrors for rustup/rust dist files
+         export RUSTUP_DIST_SERVER="https://mirrors.tuna.tsinghua.edu.cn/rustup"; \
+         export RUSTUP_UPDATE_ROOT="https://mirrors.tuna.tsinghua.edu.cn/rustup/rustup"; \
+         echo "Using TUNA mirrors for Rustup."; \
+       fi; \
+    # Force curl to use HTTP/1.1
+    curl --proto '=https' --tlsv1.2 --http1.1 -sSf https://sh.rustup.rs | bash -s -- -y --profile minimal \
+    && echo 'export PATH="/root/.cargo/bin:${PATH}"' >> /root/.bashrc
+
+ENV PATH="/root/.cargo/bin:${PATH}"
+
+RUN cargo --version && rustc --version
+
+# Add mssql ODBC driver
+# macOS ARM64 environment, install msodbcsql18.
+# general x86_64 environment, install msodbcsql17.
+RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+    curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
+    curl https://packages.microsoft.com/config/ubuntu/22.04/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
+    apt update && \
+    arch="$(uname -m)"; \
+    if [ "$arch" = "arm64" ] || [ "$arch" = "aarch64" ]; then \
+        # ARM64 (macOS/Apple Silicon or Linux aarch64)
+        ACCEPT_EULA=Y apt install -y unixodbc-dev msodbcsql18; \
+    else \
+        # x86_64 or others
+        ACCEPT_EULA=Y apt install -y unixodbc-dev msodbcsql17; \
+    fi || \
+    { echo "Failed to install ODBC driver"; exit 1; }
+
+
+
+# Add dependencies of selenium
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/chrome-linux64-121-0-6167-85,target=/chrome-linux64.zip \
+    unzip /chrome-linux64.zip && \
+    mv chrome-linux64 /opt/chrome && \
+    ln -s /opt/chrome/chrome /usr/local/bin/
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/chromedriver-linux64-121-0-6167-85,target=/chromedriver-linux64.zip \
+    unzip -j /chromedriver-linux64.zip chromedriver-linux64/chromedriver && \
+    mv chromedriver /usr/local/bin/ && \
+    rm -f /usr/bin/google-chrome
+
+# https://forum.aspose.com/t/aspose-slides-for-net-no-usable-version-of-libssl-found-with-linux-server/271344/13
+# aspose-slides on linux/arm64 is unavailable
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
+    if [ "$(uname -m)" = "x86_64" ]; then \
+        dpkg -i /deps/libssl1.1_1.1.1f-1ubuntu2_amd64.deb; \
+    elif [ "$(uname -m)" = "aarch64" ]; then \
+        dpkg -i /deps/libssl1.1_1.1.1f-1ubuntu2_arm64.deb; \
+    fi
+
+
+# builder stage
+FROM base AS builder
+USER root
+
+WORKDIR /ragflow
+
+# install dependencies from uv.lock file
+COPY pyproject.toml uv.lock ./
+
+# https://github.com/astral-sh/uv/issues/10462
+# uv records index url into uv.lock but doesn't failover among multiple indexes
+RUN --mount=type=cache,id=ragflow_uv,target=/root/.cache/uv,sharing=locked \
+    if [ "$NEED_MIRROR" == "1" ]; then \
+        sed -i 's|pypi.org|mirrors.aliyun.com/pypi|g' uv.lock; \
+    else \
+        sed -i 's|mirrors.aliyun.com/pypi|pypi.org|g' uv.lock; \
+    fi; \
+    if [ "$LIGHTEN" == "1" ]; then \
+        uv sync --python 3.10 --frozen; \
+    else \
+        uv sync --python 3.10 --frozen --all-extras; \
+    fi
+
+COPY web web
+COPY docs docs
+RUN --mount=type=cache,id=ragflow_npm,target=/root/.npm,sharing=locked \
+    cd web && npm install && npm run build
+
+COPY .git /ragflow/.git
+
+RUN version_info=$(git describe --tags --match=v* --first-parent --always); \
+    if [ "$LIGHTEN" == "1" ]; then \
+        version_info="$version_info slim"; \
+    else \
+        version_info="$version_info full"; \
+    fi; \
+    echo "RAGFlow version: $version_info"; \
+    echo $version_info > /ragflow/VERSION
+
+# production stage
+FROM base AS production
+USER root
+
+WORKDIR /ragflow
+
+# Copy Python environment and packages
+ENV VIRTUAL_ENV=/ragflow/.venv
+COPY --from=builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}
+ENV PATH="${VIRTUAL_ENV}/bin:${PATH}"
+
+ENV PYTHONPATH=/ragflow/
+
+COPY web web
+COPY api api
+COPY conf conf
+COPY deepdoc deepdoc
+COPY rag rag
+COPY agent agent
+COPY graphrag graphrag
+COPY agentic_reasoning agentic_reasoning
+COPY pyproject.toml uv.lock ./
+
+COPY docker/service_conf.yaml.template ./conf/service_conf.yaml.template
+COPY docker/entrypoint.sh docker/entrypoint-parser.sh ./
+RUN chmod +x ./entrypoint*.sh
+
+# Copy compiled web pages
+COPY --from=builder /ragflow/web/dist /ragflow/web/dist
+
+COPY --from=builder /ragflow/VERSION /ragflow/VERSION
+ENTRYPOINT ["./entrypoint.sh"]
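
A note on the `uv.lock` rewrite in the builder stage: the two `sed` commands swap the index host in either direction depending on `NEED_MIRROR`. The sketch below mimics that substitution in Python under stated assumptions: `rewrite_index` is a hypothetical helper, the sample lock line is illustrative rather than copied from the real `uv.lock`, and `str.replace` is literal whereas the `.` in the `sed` pattern is a regex wildcard.

```python
PYPI = "pypi.org"
MIRROR = "mirrors.aliyun.com/pypi"


def rewrite_index(lock_text: str, need_mirror: bool) -> str:
    """Mimic the two sed substitutions applied to uv.lock."""
    if need_mirror:
        return lock_text.replace(PYPI, MIRROR)
    return lock_text.replace(MIRROR, PYPI)


if __name__ == "__main__":
    # Hypothetical lock-file line, for illustration only.
    line = 'source = { registry = "https://pypi.org/simple" }'
    mirrored = rewrite_index(line, need_mirror=True)
    print(mirrored)
    print(rewrite_index(mirrored, need_mirror=False) == line)
```

Because the swap is symmetric, a lock file rewritten for the mirror can be restored by building again with `NEED_MIRROR=0`.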
--- a/Dockerfile.deps
+++ b/Dockerfile.deps
@@ -0,0 +1,10 @@
+# This builds an image that contains the resources needed by Dockerfile
+#
+FROM scratch
+
+# Copy resources downloaded via download_deps.py
+COPY chromedriver-linux64-121-0-6167-85 chrome-linux64-121-0-6167-85 cl100k_base.tiktoken libssl1.1_1.1.1f-1ubuntu2_amd64.deb libssl1.1_1.1.1f-1ubuntu2_arm64.deb tika-server-standard-3.0.0.jar tika-server-standard-3.0.0.jar.md5 libssl*.deb /
+
+COPY nltk_data /nltk_data
+
+COPY huggingface.co /huggingface.co
--- a/Dockerfile.scratch.oc9
+++ b/Dockerfile.scratch.oc9
@@ -0,0 +1,60 @@
+FROM opencloudos/opencloudos:9.0
+USER root
+
+WORKDIR /ragflow
+
+RUN dnf update -y && dnf install -y wget curl gcc-c++ openmpi-devel
+
+RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
+    bash ~/miniconda.sh -b -p /root/miniconda3 && \
+    rm ~/miniconda.sh && ln -s /root/miniconda3/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
+    echo ". /root/miniconda3/etc/profile.d/conda.sh" >> ~/.bashrc && \
+    echo "conda activate base" >> ~/.bashrc
+
+ENV PATH /root/miniconda3/bin:$PATH
+
+RUN conda create -y --name py11 python=3.11
+
+ENV CONDA_DEFAULT_ENV py11
+ENV CONDA_PREFIX /root/miniconda3/envs/py11
+ENV PATH $CONDA_PREFIX/bin:$PATH
+
+# RUN curl -sL https://rpm.nodesource.com/setup_14.x | bash -
+RUN dnf install -y nodejs
+
+RUN dnf install -y nginx
+
+ADD ./web ./web
+ADD ./api ./api
+ADD ./docs ./docs
+ADD ./conf ./conf
+ADD ./deepdoc ./deepdoc
+ADD ./rag ./rag
+ADD ./requirements.txt ./requirements.txt
+ADD ./agent ./agent
+ADD ./graphrag ./graphrag
+
+RUN dnf install -y openmpi openmpi-devel python3-openmpi
+ENV C_INCLUDE_PATH /usr/include/openmpi-x86_64:$C_INCLUDE_PATH
+ENV LD_LIBRARY_PATH /usr/lib64/openmpi/lib:$LD_LIBRARY_PATH
+RUN rm /root/miniconda3/envs/py11/compiler_compat/ld
+RUN cd ./web && npm i && npm run build
+RUN conda run -n py11 pip install $(grep -ivE "mpi4py" ./requirements.txt) # without mpi4py==3.1.5
+RUN conda run -n py11 pip install redis
+
+RUN dnf update -y && \
+    dnf install -y glib2 mesa-libGL && \
+    dnf clean all
+
+RUN conda run -n py11 pip install ollama
+RUN conda run -n py11 python -m nltk.downloader punkt
+RUN conda run -n py11 python -m nltk.downloader wordnet
+
+ENV PYTHONPATH=/ragflow/
+ENV HF_ENDPOINT=https://hf-mirror.com
+
+COPY docker/service_conf.yaml.template ./conf/service_conf.yaml.template
+ADD docker/entrypoint.sh ./entrypoint.sh
+RUN chmod +x ./entrypoint.sh
+
+ENTRYPOINT ["./entrypoint.sh"]
--- a/LICENSE
+++ b/LICENSE
@@ -0,0 +1,201 @@
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
--- a/README.md
+++ b/README.md
@@ -0,0 +1,41 @@
+# Ragflow-Plus
+
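+## Project introduction
+
+Building on the original Ragflow, this project adds some secondary development to address pain points encountered in real-world use.
+
+## About the name
+The name Ragflow-Plus does not claim that this project is better than Ragflow. It follows the example of Dify-Plus: a secondary-development project built on Ragflow that aims to solve pain points shared across industry applications.
+
+## New features
+
+### 1. Batch user registration / batch joining of teams
+The original user self-registration feature is hidden. Instead, an administrator registers users in batches from the backend and adds them to the administrator's team, where they can share the team's knowledge bases and default model configuration.
+
+### 2. Improved chat display
+The styling of the chat interface has been fine-tuned to make it easier on the eyes.
+
+### 3. Document writing
+A new interaction mode for document writing has been added, with direct export to Word documents.
+
+Detailed usage instructions for the new features are covered in articles on my WeChat official account [我有一计].
+
+
+## TODO
+
+- [ ] Visual admin backend for batch user registration
+
+- [ ] Inserting images in document writing
+
+- [ ] Batch upload and parsing for knowledge bases
+
+## Discussion group
+For other requirements, questions, or suggestions, join the discussion group.
+
+
+## License
+
+Copyright notice: this project is a secondary development based on Ragflow and must comply with Ragflow's open-source license, as follows: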
+
+This repository is available under the [Ragflow
+ Open Source License](LICENSE), which is essentially Apache 2.0 with a few additional restrictions.
--- a/agent/README.md
+++ b/agent/README.md
@@ -0,0 +1,45 @@
+English | [简体中文](./README_zh.md)
+
+# *Graph*
+
+
+## Introduction
+
+*Graph* is a mathematical concept composed of nodes and edges.
+Here it is used to build a complex workflow or agent.
+The graph goes beyond a DAG: cycles can be used to describe an agent or workflow.
+This folder provides a test tool, ./test/client.py, for testing DSLs such as the JSON files in ./test/dsl_examples.
+Run the client from the same folder in which you start RAGFlow. If RAGFlow runs in Docker, enter the container before running the client.
+Otherwise, a correct configuration in service_conf.yaml is essential.
+
+```bash
+PYTHONPATH=path/to/ragflow python graph/test/client.py -h
+usage: client.py [-h] -s DSL -t TENANT_ID -m
+
+options:
+  -h, --help            show this help message and exit
+  -s DSL, --dsl DSL     input dsl
+  -t TENANT_ID, --tenant_id TENANT_ID
+                        Tenant ID
+  -m, --stream          Stream output
+```
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/79179c5e-d4d6-464a-b6c4-5721cb329899" width="1000"/>
+</div>
+
+
+## How to obtain a TENANT_ID for the command line?
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/419d8588-87b1-4ab8-ac49-2d1f047a4b97" width="600"/>
+</div>
+💡 We plan to display it here in the near future.
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/c97915de-0091-46a5-afd9-e278946e5fe3" width="600"/>
+</div>
+
+
+## How to set 'kb_ids' for component 'Retrieval' in DSL?
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/0a731534-cac8-49fd-8a92-ca247eeef66d" width="600"/>
+</div>
+
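
A note on the "beyond a DAG" claim in the introduction: the example DSL in the `Canvas` docstring of agent/canvas.py contains a loop from `generate_0` back to `answer_0`. The sketch below copies only the `downstream` edges from that docstring and finds the loop with a depth-first search. The function `find_cycle` is a hypothetical helper for illustration and is not part of RAGFlow.

```python
# Edges copied from the "downstream" lists in the Canvas docstring.
DOWNSTREAM = {
    "begin": ["answer_0"],
    "answer_0": ["retrieval_0"],
    "retrieval_0": ["generate_0"],
    "generate_0": ["answer_0"],
}


def find_cycle(graph, start):
    """Return the first cycle reachable from start as a node list, or None."""
    path = []        # nodes on the current DFS path, in visiting order
    on_path = set()  # same nodes, for O(1) membership checks

    def visit(node):
        if node in on_path:
            # Close the loop: slice from the first occurrence of node.
            return path[path.index(node):] + [node]
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        return None

    return visit(start)


if __name__ == "__main__":
    print(find_cycle(DOWNSTREAM, "begin"))
```

A graph with such a loop has no topological order, so a runner for it would have to follow the path step by step instead of scheduling components once up front.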
--- a/agent/README_zh.md
+++ b/agent/README_zh.md
@@ -0,0 +1,46 @@
+[English](./README.md) | 简体中文
+
+# *Graph*
+
+
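+## Introduction
+
+A "Graph" is a mathematical concept composed of nodes and edges.
+It is used to build complex workflows or agents.
+The graph goes beyond a directed acyclic graph (DAG): cycles can be used to describe an agent or workflow.
+This folder provides a test tool, ./test/client.py,
+which can test DSL files like those in the folder ./test/dsl_examples.
+Run this client from the same folder in which you start RAGFlow. If RAGFlow runs in Docker, enter the container before running the client.
+Otherwise, correctly configuring the service_conf.yaml file is essential.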
+
+```bash
+PYTHONPATH=path/to/ragflow python graph/test/client.py -h
+usage: client.py [-h] -s DSL -t TENANT_ID -m
+
+options:
+  -h, --help            show this help message and exit
+  -s DSL, --dsl DSL     input dsl
+  -t TENANT_ID, --tenant_id TENANT_ID
+                        Tenant ID
+  -m, --stream          Stream output
+```
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/05924730-c427-495b-8ee4-90b8b2250681" width="1000"/>
+</div>
+
+
+## How is the TENANT_ID for the command line obtained?
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/419d8588-87b1-4ab8-ac49-2d1f047a4b97" width="600"/>
+</div>
+💡 It will be displayed here later:
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/c97915de-0091-46a5-afd9-e278946e5fe3" width="600"/>
+</div>
+
+
+## How to fill in kb_ids for the Retrieval component in the DSL?
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/infiniflow/ragflow/assets/12318111/0a731534-cac8-49fd-8a92-ca247eeef66d" width="600"/>
+</div>
+
--- a/agent/__init__.py
+++ b/agent/__init__.py
@@ -0,0 +1,18 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from beartype.claw import beartype_this_package
+beartype_this_package()
--- a/agent/canvas.py
+++ b/agent/canvas.py
@@ -0,0 +1,365 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+import json
+from copy import deepcopy
+from functools import partial
+
+import pandas as pd
+
+from agent.component import component_class
+from agent.component.base import ComponentBase
+
+
+class Canvas:
+    """
+    dsl = {
+        "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {},
+                },
+                "downstream": ["answer_0"],
+                "upstream": [],
+            },
+            "answer_0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["retrieval_0"],
+                "upstream": ["begin", "generate_0"],
+            },
+            "retrieval_0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {}
+                },
+                "downstream": ["generate_0"],
+                "upstream": ["answer_0"],
+            },
+            "generate_0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {}
+                },
+                "downstream": ["answer_0"],
+                "upstream": ["retrieval_0"],
+            }
+        },
+        "history": [],
+        "messages": [],
+        "reference": [],
+        "path": [["begin"]],
+        "answer": []
+    }
+    """
+
+    def __init__(self, dsl: str, tenant_id=None):
+        self.path = []
+        self.history = []
+        self.messages = []
+        self.answer = []
+        self.components = {}
+        self.dsl = json.loads(dsl) if dsl else {
+            "components": {
+                "begin": {
+                    "obj": {
+                        "component_name": "Begin",
+                        "params": {
+                            "prologue": "Hi there!"
+                        }
+                    },
+                    "downstream": [],
+                    "upstream": [],
+                    "parent_id": ""
+                }
+            },
+            "history": [],
+            "messages": [],
+            "reference": [],
+            "path": [],
+            "answer": []
+        }
+        self._tenant_id = tenant_id
+        self._embed_id = ""
+        self.load()
+
+    def load(self):
+        self.components = self.dsl["components"]
+        cpn_nms = set([])
+        for k, cpn in self.components.items():
+            cpn_nms.add(cpn["obj"]["component_name"])
+
+        assert "Begin" in cpn_nms, "There has to be a 'Begin' component."
+        assert "Answer" in cpn_nms, "There has to be an 'Answer' component."
+
+        for k, cpn in self.components.items():
+            cpn_nms.add(cpn["obj"]["component_name"])
+            param = component_class(cpn["obj"]["component_name"] + "Param")()
+            param.update(cpn["obj"]["params"])
+            param.check()
+            cpn["obj"] = component_class(cpn["obj"]["component_name"])(self, k, param)
+            if cpn["obj"].component_name == "Categorize":
+                for _, desc in param.category_description.items():
+                    if desc["to"] not in cpn["downstream"]:
+                        cpn["downstream"].append(desc["to"])
+
+        self.path = self.dsl["path"]
+        self.history = self.dsl["history"]
+        self.messages = self.dsl["messages"]
+        self.answer = self.dsl["answer"]
+        self.reference = self.dsl["reference"]
+        self._embed_id = self.dsl.get("embed_id", "")
+
+    def __str__(self):
+        self.dsl["path"] = self.path
+        self.dsl["history"] = self.history
+        self.dsl["messages"] = self.messages
+        self.dsl["answer"] = self.answer
+        self.dsl["reference"] = self.reference
+        self.dsl["embed_id"] = self._embed_id
+        dsl = {
+            "components": {}
+        }
+        for k in self.dsl.keys():
+            if k in ["components"]:
+                continue
+            dsl[k] = deepcopy(self.dsl[k])
+
+        for k, cpn in self.components.items():
+            if k not in dsl["components"]:
+                dsl["components"][k] = {}
+            for c in cpn.keys():
+                if c == "obj":
+                    dsl["components"][k][c] = json.loads(str(cpn["obj"]))
+                    continue
+                dsl["components"][k][c] = deepcopy(cpn[c])
+        return json.dumps(dsl, ensure_ascii=False)
+
+    def reset(self):
+        self.path = []
+        self.history = []
+        self.messages = []
+        self.answer = []
+        self.reference = []
+        for k, cpn in self.components.items():
+            self.components[k]["obj"].reset()
+        self._embed_id = ""
+
+    def get_component_name(self, cid):
+        for n in self.dsl["graph"]["nodes"]:
+            if cid == n["id"]:
+                return n["data"]["name"]
+        return ""
+
+    def run(self, **kwargs):
+        if self.answer:
+            cpn_id = self.answer[0]
+            self.answer.pop(0)
+            try:
+                ans = self.components[cpn_id]["obj"].run(self.history, **kwargs)
+            except Exception as e:
+                ans = ComponentBase.be_output(str(e))
+            self.path[-1].append(cpn_id)
+            if kwargs.get("stream"):
+                for an in ans():
+                    yield an
+            else:
+                yield ans
+            return
+
+        if not self.path:
+            self.components["begin"]["obj"].run(self.history, **kwargs)
+            self.path.append(["begin"])
+
+        self.path.append([])
+
+        ran = -1
+        waiting = []
+        without_dependent_checking = []
+
+        def prepare2run(cpns):
+            nonlocal ran, ans
+            for c in cpns:
+                if self.path[-1] and c == self.path[-1][-1]:
+                    continue
+                cpn = self.components[c]["obj"]
+                if cpn.component_name == "Answer":
+                    self.answer.append(c)
+                else:
+                    logging.debug(f"Canvas.prepare2run: {c}")
+                    if c not in without_dependent_checking:
+                        cpids = cpn.get_dependent_components()
+                        if any([cc not in self.path[-1] for cc in cpids]):
+                            if c not in waiting:
+                                waiting.append(c)
+                            continue
+                    yield "*'{}'* is running...🕞".format(self.get_component_name(c))
+
+                    if cpn.component_name.lower() == "iteration":
+                        st_cpn = cpn.get_start()
+                        assert st_cpn, "Start component not found for Iteration."
+                        if not st_cpn["obj"].end():
+                            cpn = st_cpn["obj"]
+                            c = cpn._id
+
+                    try:
+                        ans = cpn.run(self.history, **kwargs)
+                    except Exception as e:
+                        logging.exception(f"Canvas.run got exception: {e}")
+                        self.path[-1].append(c)
+                        ran += 1
+                        raise e
+                    self.path[-1].append(c)
+
+            ran += 1
+
+        downstream = self.components[self.path[-2][-1]]["downstream"]
+        if not downstream and self.components[self.path[-2][-1]].get("parent_id"):
+            cid = self.path[-2][-1]
+            pid = self.components[cid]["parent_id"]
+            # output() returns (output_var_name, DataFrame); keep only the DataFrame.
+            _, o = self.components[cid]["obj"].output(allow_partial=False)
+            _, oo = self.components[pid]["obj"].output(allow_partial=False)
+            self.components[pid]["obj"].set_output(pd.concat([oo, o], ignore_index=True))
+            downstream = [pid]
+
+        for m in prepare2run(downstream):
+            yield {"content": m, "running_status": True}
+
+        while 0 <= ran < len(self.path[-1]):
+            logging.debug(f"Canvas.run: {ran} {self.path}")
+            cpn_id = self.path[-1][ran]
+            cpn = self.get_component(cpn_id)
+            if not any([cpn["downstream"], cpn.get("parent_id"), waiting]):
+                break
+
+            loop = self._find_loop()
+            if loop:
+                raise OverflowError(f"Too many loops: {loop}")
+
+            if cpn["obj"].component_name.lower() in ["switch", "categorize", "relevant"]:
+                switch_out = cpn["obj"].output()[1].iloc[0, 0]
+                assert switch_out in self.components, \
+                    "{}'s output: {} not valid.".format(cpn_id, switch_out)
+                for m in prepare2run([switch_out]):
+                    yield {"content": m, "running_status": True}
+                continue
+
+            downstream = cpn["downstream"]
+            if not downstream and cpn.get("parent_id"):
+                pid = cpn["parent_id"]
+                _, o = cpn["obj"].output(allow_partial=False)
+                _, oo = self.components[pid]["obj"].output(allow_partial=False)
+                self.components[pid]["obj"].set_output(pd.concat([oo.dropna(axis=1), o.dropna(axis=1)], ignore_index=True))
+                downstream = [pid]
+
+            for m in prepare2run(downstream):
+                yield {"content": m, "running_status": True}
+
+            if ran >= len(self.path[-1]) and waiting:
+                without_dependent_checking = waiting
+                waiting = []
+                for m in prepare2run(without_dependent_checking):
+                    yield {"content": m, "running_status": True}
+                without_dependent_checking = []
+                ran -= 1
+
+        if self.answer:
+            cpn_id = self.answer[0]
+            self.answer.pop(0)
+            ans = self.components[cpn_id]["obj"].run(self.history, **kwargs)
+            self.path[-1].append(cpn_id)
+            if kwargs.get("stream"):
+                assert isinstance(ans, partial)
+                for an in ans():
+                    yield an
+            else:
+                yield ans
+
+        else:
+            raise Exception("The dialog flow has no way to interact with you. Please add an 'Interact' component to the end of the flow.")
+
+    def get_component(self, cpn_id):
+        return self.components[cpn_id]
+
+    def get_tenant_id(self):
+        return self._tenant_id
+
+    def get_history(self, window_size):
+        convs = []
+        for role, obj in self.history[window_size * -1:]:
+            if isinstance(obj, list) and obj and all([isinstance(o, dict) for o in obj]):
+                convs.append({"role": role, "content": '\n'.join([str(s.get("content", "")) for s in obj])})
+            else:
+                convs.append({"role": role, "content": str(obj)})
+        return convs
+
+    def add_user_input(self, question):
+        self.history.append(("user", question))
+
+    def set_embedding_model(self, embed_id):
+        self._embed_id = embed_id
+
+    def get_embedding_model(self):
+        return self._embed_id
+
+    def _find_loop(self, max_loops=6):
+        path = self.path[-1][::-1]
+        if len(path) < 2:
+            return False
+
+        for i in range(len(path)):
+            if path[i].lower().find("answer") == 0 or path[i].lower().find("iterationitem") == 0:
+                path = path[:i]
+                break
+
+        if len(path) < 2:
+            return False
+
+        for loc in range(2, len(path) // 2):
+            pat = ",".join(path[0:loc])
+            path_str = ",".join(path)
+            if len(pat) >= len(path_str):
+                return False
+            loop = max_loops
+            while path_str.find(pat) == 0 and loop >= 0:
+                loop -= 1
+                if len(pat)+1 >= len(path_str):
+                    return False
+                path_str = path_str[len(pat)+1:]
+            if loop < 0:
+                pat = " => ".join([p.split(":")[0] for p in path[0:loc]])
+                return pat + " => " + pat
+
+        return False
+
+    def get_prologue(self):
+        return self.components["begin"]["obj"]._param.prologue
+
+    def set_global_param(self, **kwargs):
+        for k, v in kwargs.items():
+            for q in self.components["begin"]["obj"]._param.query:
+                if k != q["key"]:
+                    continue
+                q["value"] = v
+
+    def get_preset_param(self):
+        return self.components["begin"]["obj"]._param.query
+
+    def get_component_input_elements(self, cpnnm):
+        return self.components[cpnnm]["obj"].get_input_elements()
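The DSL layout described in the `Canvas` docstring can be checked before it is handed to `Canvas`. The sketch below is a hypothetical standalone helper, not part of RAGFlow: `validate_dsl` and `EXAMPLE_DSL` are names invented for illustration. It mirrors the two assertions made in `Canvas.load()` (a `Begin` and an `Answer` component must exist) and, as an extra assumption of this sketch, also checks that every `downstream` edge is mirrored in the target's `upstream` list.

```python
import json


def validate_dsl(dsl_text: str) -> list[str]:
    """Return a list of problems found in a canvas DSL string (hypothetical helper)."""
    components = json.loads(dsl_text)["components"]
    names = {cpn["obj"]["component_name"] for cpn in components.values()}

    problems = []
    # Canvas.load() asserts that both of these component types are present.
    for required in ("Begin", "Answer"):
        if required not in names:
            problems.append(f"missing required component: {required}")

    # Extra check (assumption of this sketch): edges should be symmetric.
    for cid, cpn in components.items():
        for down in cpn.get("downstream", []):
            if down not in components:
                problems.append(f"{cid}: unknown downstream {down}")
            elif cid not in components[down].get("upstream", []):
                problems.append(f"{cid} -> {down}: missing from upstream of {down}")
    return problems


# The example graph from the Canvas docstring, with empty params.
EXAMPLE_DSL = json.dumps({
    "components": {
        "begin": {"obj": {"component_name": "Begin", "params": {}},
                  "downstream": ["answer_0"], "upstream": []},
        "answer_0": {"obj": {"component_name": "Answer", "params": {}},
                     "downstream": ["retrieval_0"], "upstream": ["begin", "generate_0"]},
        "retrieval_0": {"obj": {"component_name": "Retrieval", "params": {}},
                        "downstream": ["generate_0"], "upstream": ["answer_0"]},
        "generate_0": {"obj": {"component_name": "Generate", "params": {}},
                       "downstream": ["answer_0"], "upstream": ["retrieval_0"]},
    },
    "history": [], "messages": [], "reference": [], "path": [], "answer": [],
})

print(validate_dsl(EXAMPLE_DSL))
```

Running such a check first turns a bare `AssertionError` from `Canvas.load()` into a readable list of problems.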
--- a/agent/component/__init__.py
+++ b/agent/component/__init__.py
@ -0,0 +1,133 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import importlib
+from .begin import Begin, BeginParam
+from .generate import Generate, GenerateParam
+from .retrieval import Retrieval, RetrievalParam
+from .answer import Answer, AnswerParam
+from .categorize import Categorize, CategorizeParam
+from .switch import Switch, SwitchParam
+from .relevant import Relevant, RelevantParam
+from .message import Message, MessageParam
+from .rewrite import RewriteQuestion, RewriteQuestionParam
+from .keyword import KeywordExtract, KeywordExtractParam
+from .concentrator import Concentrator, ConcentratorParam
+from .baidu import Baidu, BaiduParam
+from .duckduckgo import DuckDuckGo, DuckDuckGoParam
+from .wikipedia import Wikipedia, WikipediaParam
+from .pubmed import PubMed, PubMedParam
+from .arxiv import ArXiv, ArXivParam
+from .google import Google, GoogleParam
+from .bing import Bing, BingParam
+from .googlescholar import GoogleScholar, GoogleScholarParam
+from .deepl import DeepL, DeepLParam
+from .github import GitHub, GitHubParam
+from .baidufanyi import BaiduFanyi, BaiduFanyiParam
+from .qweather import QWeather, QWeatherParam
+from .exesql import ExeSQL, ExeSQLParam
+from .yahoofinance import YahooFinance, YahooFinanceParam
+from .wencai import WenCai, WenCaiParam
+from .jin10 import Jin10, Jin10Param
+from .tushare import TuShare, TuShareParam
+from .akshare import AkShare, AkShareParam
+from .crawler import Crawler, CrawlerParam
+from .invoke import Invoke, InvokeParam
+from .template import Template, TemplateParam
+from .email import Email, EmailParam
+from .iteration import Iteration, IterationParam
+from .iterationitem import IterationItem, IterationItemParam
+
+
+def component_class(class_name):
+    m = importlib.import_module("agent.component")
+    c = getattr(m, class_name)
+    return c
+
+
+__all__ = [
+    "Begin",
+    "BeginParam",
+    "Generate",
+    "GenerateParam",
+    "Retrieval",
+    "RetrievalParam",
+    "Answer",
+    "AnswerParam",
+    "Categorize",
+    "CategorizeParam",
+    "Switch",
+    "SwitchParam",
+    "Relevant",
+    "RelevantParam",
+    "Message",
+    "MessageParam",
+    "RewriteQuestion",
+    "RewriteQuestionParam",
+    "KeywordExtract",
+    "KeywordExtractParam",
+    "Concentrator",
+    "ConcentratorParam",
+    "Baidu",
+    "BaiduParam",
+    "DuckDuckGo",
+    "DuckDuckGoParam",
+    "Wikipedia",
+    "WikipediaParam",
+    "PubMed",
+    "PubMedParam",
+    "ArXiv",
+    "ArXivParam",
+    "Google",
+    "GoogleParam",
+    "Bing",
+    "BingParam",
+    "GoogleScholar",
+    "GoogleScholarParam",
+    "DeepL",
+    "DeepLParam",
+    "GitHub",
+    "GitHubParam",
+    "BaiduFanyi",
+    "BaiduFanyiParam",
+    "QWeather",
+    "QWeatherParam",
+    "ExeSQL",
+    "ExeSQLParam",
+    "YahooFinance",
+    "YahooFinanceParam",
+    "WenCai",
+    "WenCaiParam",
+    "Jin10",
+    "Jin10Param",
+    "TuShare",
+    "TuShareParam",
+    "AkShare",
+    "AkShareParam",
+    "Crawler",
+    "CrawlerParam",
+    "Invoke",
+    "InvokeParam",
+    "Iteration",
+    "IterationParam",
+    "IterationItem",
+    "IterationItemParam",
+    "Template",
+    "TemplateParam",
+    "Email",
+    "EmailParam",
+    "component_class"
+]
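`Canvas.load()` relies on a naming convention: for a component named `X`, it looks up `XParam` and `X` through `component_class`, which is a `getattr` on the package module. The sketch below imitates that convention in isolation. `Begin`, `BeginParam`, `components`, `lookup`, and `build` here are simplified stand-ins written for this example, not the real RAGFlow classes, and the `SimpleNamespace` plays the role of the imported `agent.component` module.

```python
import types


class BeginParam:
    """Simplified stand-in for a component parameter class."""

    def __init__(self):
        self.prologue = "Hi there!"

    def update(self, conf: dict) -> None:
        # Overwrite defaults with the values supplied in the DSL.
        for key, value in conf.items():
            setattr(self, key, value)


class Begin:
    """Simplified stand-in for a component class."""

    component_name = "Begin"

    def __init__(self, canvas, component_id, param):
        self.canvas = canvas
        self.component_id = component_id
        self.param = param


# Stands in for the module object returned by importlib.import_module().
components = types.SimpleNamespace(Begin=Begin, BeginParam=BeginParam)


def lookup(class_name: str):
    """Resolve a class by name, as component_class() does with getattr()."""
    return getattr(components, class_name)


def build(component_id: str, obj_conf: dict, canvas=None):
    """Create the parameter object first, then the component that owns it."""
    name = obj_conf["component_name"]
    param = lookup(name + "Param")()
    param.update(obj_conf["params"])
    return lookup(name)(canvas, component_id, param)


begin = build("begin", {"component_name": "Begin", "params": {"prologue": "Hello"}})
print(type(begin).__name__, begin.param.prologue)
```

One consequence of this design is that an unknown `component_name` in the DSL surfaces as an `AttributeError` from `getattr`, so a new component must be both imported in this `__init__.py` and named consistently with its `Param` class.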
--- a/agent/component/akshare.py
+++ b/agent/component/akshare.py
@ -0,0 +1,56 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class AkShareParam(ComponentParamBase):
+    """
+    Define the AkShare component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+
+
+class AkShare(ComponentBase, ABC):
+    component_name = "AkShare"
+
+    def _run(self, history, **kwargs):
+        import akshare as ak
+        ans = self.get_input()
+        ans = ",".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return AkShare.be_output("")
+
+        try:
+            ak_res = []
+            stock_news_em_df = ak.stock_news_em(symbol=ans)
+            stock_news_em_df = stock_news_em_df.head(self._param.top_n)
+            ak_res = [{"content": '<a href="' + i["新闻链接"] + '">' + i["新闻标题"] + '</a>\n 新闻内容: ' + i[
+                "新闻内容"] + " \n发布时间:" + i["发布时间"] + " \n文章来源: " + i["文章来源"]} for index, i in stock_news_em_df.iterrows()]
+        except Exception as e:
+            return AkShare.be_output("**ERROR**: " + str(e))
+
+        if not ak_res:
+            return AkShare.be_output("")
+
+        return pd.DataFrame(ak_res)
--- a/agent/component/answer.py
+++ b/agent/component/answer.py
@ -0,0 +1,89 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import random
+from abc import ABC
+from functools import partial
+from typing import Tuple, Union
+
+import pandas as pd
+
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class AnswerParam(ComponentParamBase):
+
+    """
+    Define the Answer component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.post_answers = []
+
+    def check(self):
+        return True
+
+
+class Answer(ComponentBase, ABC):
+    component_name = "Answer"
+
+    def _run(self, history, **kwargs):
+        if kwargs.get("stream"):
+            return partial(self.stream_output)
+
+        ans = self.get_input()
+        if self._param.post_answers:
+            ans = pd.concat([ans, pd.DataFrame([{"content": random.choice(self._param.post_answers)}])], ignore_index=False)
+        return ans
+
+    def stream_output(self):
+        res = None
+        if hasattr(self, "exception") and self.exception:
+            res = {"content": str(self.exception)}
+            self.exception = None
+            yield res
+            self.set_output(res)
+            return
+
+        stream = self.get_stream_input()
+        if isinstance(stream, pd.DataFrame):
+            res = stream
+            answer = ""
+            for ii, row in stream.iterrows():
+                answer += row.to_dict()["content"]
+                yield {"content": answer}
+        else:
+            for st in stream():
+                res = st
+                yield st
+        if self._param.post_answers:
+            res["content"] += random.choice(self._param.post_answers)
+            yield res
+
+        self.set_output(res)
+
+    def set_exception(self, e):
+        self.exception = e
+
+    def output(self, allow_partial=True) -> Tuple[str, Union[pd.DataFrame, partial]]:
+        if allow_partial:
+            return super().output()
+
+        for r, c in self._canvas.history[::-1]:
+            if r == "user":
+                return self._param.output_var_name, pd.DataFrame([{"content": c}])
+
+        return self._param.output_var_name, pd.DataFrame([])
+
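In streaming mode, `Answer._run` returns a `functools.partial` wrapping a generator function rather than a value, and `Canvas.run` later calls it and iterates over the chunks. The sketch below shows that hand-off pattern on its own. `stream_output` and `run` are simplified, hypothetical functions for illustration and take plain string chunks instead of RAGFlow component inputs.

```python
from functools import partial


def stream_output(chunks, suffix=""):
    """Yield the accumulated answer after each chunk, like Answer.stream_output."""
    answer = ""
    for chunk in chunks:
        answer += chunk
        yield {"content": answer}
    if suffix:
        # Optional trailing text, comparable to post_answers in the real component.
        yield {"content": answer + suffix}


def run(chunks, stream=False):
    """Return a deferred generator when streaming, else the full answer at once."""
    if stream:
        return partial(stream_output, chunks)
    return {"content": "".join(chunks)}


result = run(["Hel", "lo"], stream=True)
# The caller decides when to start consuming, as Canvas.run does with ans().
for piece in result():
    print(piece)
```

Returning a `partial` keeps the component's `run` signature uniform: the caller receives one object in both modes and only starts the generator when it is ready to forward chunks.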
--- a/agent/component/arxiv.py
+++ b/agent/component/arxiv.py
@ -0,0 +1,68 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import arxiv
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+class ArXivParam(ComponentParamBase):
+    """
+    Define the ArXiv component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 6
+        self.sort_by = 'submittedDate'
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_valid_value(self.sort_by, "ArXiv Search Sort_by",
+                               ['submittedDate', 'lastUpdatedDate', 'relevance'])
+
+
+class ArXiv(ComponentBase, ABC):
+    component_name = "ArXiv"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return ArXiv.be_output("")
+
+        try:
+            sort_choices = {"relevance": arxiv.SortCriterion.Relevance,
+                            "lastUpdatedDate": arxiv.SortCriterion.LastUpdatedDate,
+                            'submittedDate': arxiv.SortCriterion.SubmittedDate}
+            arxiv_client = arxiv.Client()
+            search = arxiv.Search(
+                query=ans,
+                max_results=self._param.top_n,
+                sort_by=sort_choices[self._param.sort_by]
+            )
+            arxiv_res = [
+                {"content": 'Title: ' + i.title + '\nPdf_Url: <a href="' + i.pdf_url + '">' + i.pdf_url + '</a> \nSummary: ' + i.summary} for
+                i in list(arxiv_client.results(search))]
+        except Exception as e:
+            return ArXiv.be_output("**ERROR**: " + str(e))
+
+        if not arxiv_res:
+            return ArXiv.be_output("")
+
+        df = pd.DataFrame(arxiv_res)
+        logging.debug(f"df: {str(df)}")
+        return df
--- a/agent/component/baidu.py
+++ b/agent/component/baidu.py
@ -0,0 +1,67 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import pandas as pd
+import requests
+import re
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class BaiduParam(ComponentParamBase):
+    """
+    Define the Baidu component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+
+
+class Baidu(ComponentBase, ABC):
+    component_name = "Baidu"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return Baidu.be_output("")
+
+        try:
+            url = 'http://www.baidu.com/s?wd=' + ans + '&rn=' + str(self._param.top_n)
+            headers = {
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'}
+            response = requests.get(url=url, headers=headers)
+
+            url_res = re.findall(r"'url': \\\"(.*?)\\\"}", response.text)
+            title_res = re.findall(r"'title': \\\"(.*?)\\\",\\n", response.text)
+            body_res = re.findall(r"\"contentText\":\"(.*?)\"", response.text)
+            baidu_res = [{"content": re.sub('<em>|</em>', '', '<a href="' + url + '">' + title + '</a>    ' + body)} for
+                         url, title, body in zip(url_res, title_res, body_res)]
+            del body_res, url_res, title_res
+        except Exception as e:
+            return Baidu.be_output("**ERROR**: " + str(e))
+
+        if not baidu_res:
+            return Baidu.be_output("")
+
+        df = pd.DataFrame(baidu_res)
+        logging.debug(f"df: {str(df)}")
+        return df
+
--- a/agent/component/baidufanyi.py
+++ b/agent/component/baidufanyi.py
@ -0,0 +1,96 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import random
+from abc import ABC
+import requests
+from agent.component.base import ComponentBase, ComponentParamBase
+from hashlib import md5
+
+
+class BaiduFanyiParam(ComponentParamBase):
+    """
+    Define the BaiduFanyi component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.appid = "xxx"
+        self.secret_key = "xxx"
+        self.trans_type = 'translate'
+        self.parameters = []
+        self.source_lang = 'auto'
+        self.target_lang = 'auto'
+        self.domain = 'finance'
+
+    def check(self):
+        self.check_empty(self.appid, "BaiduFanyi APPID")
+        self.check_empty(self.secret_key, "BaiduFanyi Secret Key")
+        self.check_valid_value(self.trans_type, "Translate type", ['translate', 'fieldtranslate'])
+        self.check_valid_value(self.source_lang, "Source language",
+                               ['auto', 'zh', 'en', 'yue', 'wyw', 'jp', 'kor', 'fra', 'spa', 'th', 'ara', 'ru', 'pt',
+                                'de', 'it', 'el', 'nl', 'pl', 'bul', 'est', 'dan', 'fin', 'cs', 'rom', 'slo', 'swe',
+                                'hu', 'cht', 'vie'])
+        self.check_valid_value(self.target_lang, "Target language",
+                               ['auto', 'zh', 'en', 'yue', 'wyw', 'jp', 'kor', 'fra', 'spa', 'th', 'ara', 'ru', 'pt',
+                                'de', 'it', 'el', 'nl', 'pl', 'bul', 'est', 'dan', 'fin', 'cs', 'rom', 'slo', 'swe',
+                                'hu', 'cht', 'vie'])
+        self.check_valid_value(self.domain, "Translate field",
+                               ['it', 'finance', 'machinery', 'senimed', 'novel', 'academic', 'aerospace', 'wiki',
+                                'news', 'law', 'contract'])
+
+
+class BaiduFanyi(ComponentBase, ABC):
+    component_name = "BaiduFanyi"
+
+    def _run(self, history, **kwargs):
+
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return BaiduFanyi.be_output("")
+
+        try:
+            source_lang = self._param.source_lang
+            target_lang = self._param.target_lang
+            appid = self._param.appid
+            # Keep the salt as a string so it can be concatenated into the sign and the URL.
+            salt = str(random.randint(32768, 65536))
+            secret_key = self._param.secret_key
+
+            if self._param.trans_type == 'translate':
+                sign = md5((appid + ans + salt + secret_key).encode('utf-8')).hexdigest()
+                url = 'http://api.fanyi.baidu.com/api/trans/vip/translate?' + 'q=' + ans + '&from=' + source_lang + '&to=' + target_lang + '&appid=' + appid + '&salt=' + salt + '&sign=' + sign
+                headers = {"Content-Type": "application/x-www-form-urlencoded"}
+                response = requests.post(url=url, headers=headers).json()
+
+                if response.get('error_code'):
+                    return BaiduFanyi.be_output("**Error**:" + response['error_msg'])
+
+                return BaiduFanyi.be_output(response['trans_result'][0]['dst'])
+            elif self._param.trans_type == 'fieldtranslate':
+                domain = self._param.domain
+                sign = md5((appid + ans + salt + domain + secret_key).encode('utf-8')).hexdigest()
+                url = 'http://api.fanyi.baidu.com/api/trans/vip/fieldtranslate?' + 'q=' + ans + '&from=' + source_lang + '&to=' + target_lang + '&appid=' + appid + '&salt=' + salt + '&domain=' + domain + '&sign=' + sign
+                headers = {"Content-Type": "application/x-www-form-urlencoded"}
+                response = requests.post(url=url, headers=headers).json()
+
+                if response.get('error_code'):
+                    return BaiduFanyi.be_output("**Error**:" + response['error_msg'])
+
+                return BaiduFanyi.be_output(response['trans_result'][0]['dst'])
+
+        except Exception as e:
+            return BaiduFanyi.be_output("**Error**:" + str(e))
+    
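The request signature in `BaiduFanyi._run` is the MD5 hex digest of `appid + query + salt + secret_key`, in that order, as taken from the code above. The sketch below isolates that step and builds the query string with `urllib.parse.urlencode`, so that non-ASCII text and spaces are escaped. The URL encoding is an addition of this sketch, and `build_sign` and `build_translate_query` are hypothetical helper names. No network call is made.

```python
import random
from hashlib import md5
from urllib.parse import urlencode


def build_sign(appid: str, query: str, salt, secret_key: str) -> str:
    """MD5 over appid + query + salt + secret_key, with the salt coerced to str."""
    raw = appid + query + str(salt) + secret_key
    return md5(raw.encode("utf-8")).hexdigest()


def build_translate_query(appid: str, secret_key: str, query: str,
                          source_lang: str = "auto", target_lang: str = "en") -> str:
    """Return an escaped query string for the general translate endpoint."""
    salt = str(random.randint(32768, 65536))
    return urlencode({
        "q": query,
        "from": source_lang,
        "to": target_lang,
        "appid": appid,
        "salt": salt,
        "sign": build_sign(appid, query, salt, secret_key),
    })


# Placeholder credentials for illustration only.
print(build_translate_query("my-appid", "my-secret", "你好 world"))
```

Note that the sign is computed over the raw query text, while the escaped form is only used when the text is placed in the URL.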
--- a/agent/component/base.py
+++ b/agent/component/base.py
@ -0,0 +1,586 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import builtins
+import json
+import os
+import logging
+from functools import partial
+from typing import Tuple, Union
+
+import pandas as pd
+
+from agent import settings
+
+_FEEDED_DEPRECATED_PARAMS = "_feeded_deprecated_params"
+_DEPRECATED_PARAMS = "_deprecated_params"
+_USER_FEEDED_PARAMS = "_user_feeded_params"
+_IS_RAW_CONF = "_is_raw_conf"
+
+
+class ComponentParamBase(ABC):
+    def __init__(self):
+        self.output_var_name = "output"
+        self.message_history_window_size = 22
+        self.query = []
+        self.inputs = []
+        self.debug_inputs = []
+
+    def set_name(self, name: str):
+        self._name = name
+        return self
+
+    def check(self):
+        raise NotImplementedError("Parameter Object should be checked.")
+
+    @classmethod
+    def _get_or_init_deprecated_params_set(cls):
+        if not hasattr(cls, _DEPRECATED_PARAMS):
+            setattr(cls, _DEPRECATED_PARAMS, set())
+        return getattr(cls, _DEPRECATED_PARAMS)
+
+    def _get_or_init_feeded_deprecated_params_set(self, conf=None):
+        if not hasattr(self, _FEEDED_DEPRECATED_PARAMS):
+            if conf is None:
+                setattr(self, _FEEDED_DEPRECATED_PARAMS, set())
+            else:
+                setattr(
+                    self,
+                    _FEEDED_DEPRECATED_PARAMS,
+                    set(conf[_FEEDED_DEPRECATED_PARAMS]),
+                )
+        return getattr(self, _FEEDED_DEPRECATED_PARAMS)
+
+    def _get_or_init_user_feeded_params_set(self, conf=None):
+        if not hasattr(self, _USER_FEEDED_PARAMS):
+            if conf is None:
+                setattr(self, _USER_FEEDED_PARAMS, set())
+            else:
+                setattr(self, _USER_FEEDED_PARAMS, set(conf[_USER_FEEDED_PARAMS]))
+        return getattr(self, _USER_FEEDED_PARAMS)
+
+    def get_user_feeded(self):
+        return self._get_or_init_user_feeded_params_set()
+
+    def get_feeded_deprecated_params(self):
+        return self._get_or_init_feeded_deprecated_params_set()
+
+    @property
+    def _deprecated_params_set(self):
+        return {name: True for name in self.get_feeded_deprecated_params()}
+
+    def __str__(self):
+        return json.dumps(self.as_dict(), ensure_ascii=False)
+
+    def as_dict(self):
+        def _recursive_convert_obj_to_dict(obj):
+            ret_dict = {}
+            for attr_name in list(obj.__dict__):
+                if attr_name in [_FEEDED_DEPRECATED_PARAMS, _DEPRECATED_PARAMS, _USER_FEEDED_PARAMS, _IS_RAW_CONF]:
+                    continue
+                # get attr
+                attr = getattr(obj, attr_name)
+                if isinstance(attr, pd.DataFrame):
+                    ret_dict[attr_name] = attr.to_dict()
+                    continue
+                if attr and type(attr).__name__ not in dir(builtins):
+                    ret_dict[attr_name] = _recursive_convert_obj_to_dict(attr)
+                else:
+                    ret_dict[attr_name] = attr
+
+            return ret_dict
+
+        return _recursive_convert_obj_to_dict(self)
+
+    def update(self, conf, allow_redundant=False):
+        update_from_raw_conf = conf.get(_IS_RAW_CONF, True)
+        if update_from_raw_conf:
+            deprecated_params_set = self._get_or_init_deprecated_params_set()
+            feeded_deprecated_params_set = (
+                self._get_or_init_feeded_deprecated_params_set()
+            )
+            user_feeded_params_set = self._get_or_init_user_feeded_params_set()
+            setattr(self, _IS_RAW_CONF, False)
+        else:
+            feeded_deprecated_params_set = (
+                self._get_or_init_feeded_deprecated_params_set(conf)
+            )
+            user_feeded_params_set = self._get_or_init_user_feeded_params_set(conf)
+
+        def _recursive_update_param(param, config, depth, prefix):
+            if depth > settings.PARAM_MAXDEPTH:
+                raise ValueError("Parameter definition is nested too deeply; cannot parse it")
+
+            inst_variables = param.__dict__
+            redundant_attrs = []
+            for config_key, config_value in config.items():
+                # redundant attr
+                if config_key not in inst_variables:
+                    if not update_from_raw_conf and config_key.startswith("_"):
+                        setattr(param, config_key, config_value)
+                    else:
+                        setattr(param, config_key, config_value)
+                        # redundant_attrs.append(config_key)
+                    continue
+
+                full_config_key = f"{prefix}{config_key}"
+
+                if update_from_raw_conf:
+                    # add user feeded params
+                    user_feeded_params_set.add(full_config_key)
+
+                    # update user feeded deprecated param set
+                    if full_config_key in deprecated_params_set:
+                        feeded_deprecated_params_set.add(full_config_key)
+
+                # supported attr
+                attr = getattr(param, config_key)
+                if type(attr).__name__ in dir(builtins) or attr is None:
+                    setattr(param, config_key, config_value)
+
+                else:
+                    # recursive set obj attr
+                    sub_params = _recursive_update_param(
+                        attr, config_value, depth + 1, prefix=f"{prefix}{config_key}."
+                    )
+                    setattr(param, config_key, sub_params)
+
+            if not allow_redundant and redundant_attrs:
+                raise ValueError(
+                    f"cpn `{getattr(self, '_name', type(self))}` has redundant parameters: `{redundant_attrs}`"
+                )
+
+            return param
+
+        return _recursive_update_param(param=self, config=conf, depth=0, prefix="")
+
+    def extract_not_builtin(self):
+        def _get_not_builtin_types(obj):
+            ret_dict = {}
+            for variable in obj.__dict__:
+                attr = getattr(obj, variable)
+                if attr and type(attr).__name__ not in dir(builtins):
+                    ret_dict[variable] = _get_not_builtin_types(attr)
+
+            return ret_dict
+
+        return _get_not_builtin_types(self)
+
+    def validate(self):
+        self.builtin_types = dir(builtins)
+        self.func = {
+            "ge": self._greater_equal_than,
+            "le": self._less_equal_than,
+            "in": self._in,
+            "not_in": self._not_in,
+            "range": self._range,
+        }
+        home_dir = os.path.abspath(os.path.dirname(os.path.realpath(__file__)))
+        param_validation_path_prefix = home_dir + "/param_validation/"
+
+        param_name = type(self).__name__
+        param_validation_path = "/".join(
+            [param_validation_path_prefix, param_name + ".json"]
+        )
+
+        validation_json = None
+
+        try:
+            with open(param_validation_path, "r") as fin:
+                validation_json = json.loads(fin.read())
+        except Exception:
+            return
+
+        self._validate_param(self, validation_json)
+
+    def _validate_param(self, param_obj, validation_json):
+        default_section = type(param_obj).__name__
+        var_list = param_obj.__dict__
+
+        for variable in var_list:
+            attr = getattr(param_obj, variable)
+
+            if type(attr).__name__ in self.builtin_types or attr is None:
+                if variable not in validation_json:
+                    continue
+
+                validation_dict = validation_json[default_section][variable]
+                value = getattr(param_obj, variable)
+                value_legal = False
+
+                for op_type in validation_dict:
+                    if self.func[op_type](value, validation_dict[op_type]):
+                        value_legal = True
+                        break
+
+                if not value_legal:
+                    raise ValueError(
+                        "Please check runtime conf, {} = {} does not match user-parameter restriction".format(
+                            variable, value
+                        )
+                    )
+
+            elif variable in validation_json:
+                self._validate_param(attr, validation_json)
+
+    @staticmethod
+    def check_string(param, descr):
+        if type(param).__name__ not in ["str"]:
+            raise ValueError(
+                descr + " {} not supported, should be string type".format(param)
+            )
+
+    @staticmethod
+    def check_empty(param, descr):
+        if not param:
+            raise ValueError(
+                descr + " does not support empty value."
+            )
+
+    @staticmethod
+    def check_positive_integer(param, descr):
+        if type(param).__name__ not in ["int", "long"] or param <= 0:
+            raise ValueError(
+                descr + " {} not supported, should be positive integer".format(param)
+            )
+
+    @staticmethod
+    def check_positive_number(param, descr):
+        if type(param).__name__ not in ["float", "int", "long"] or param <= 0:
+            raise ValueError(
+                descr + " {} not supported, should be positive numeric".format(param)
+            )
+
+    @staticmethod
+    def check_nonnegative_number(param, descr):
+        if type(param).__name__ not in ["float", "int", "long"] or param < 0:
+            raise ValueError(
+                descr
+                + " {} not supported, should be non-negative numeric".format(param)
+            )
+
+    @staticmethod
+    def check_decimal_float(param, descr):
+        if type(param).__name__ not in ["float", "int"] or param < 0 or param > 1:
+            raise ValueError(
+                descr
+                + " {} not supported, should be a float number in range [0, 1]".format(
+                    param
+                )
+            )
+
+    @staticmethod
+    def check_boolean(param, descr):
+        if type(param).__name__ != "bool":
+            raise ValueError(
+                descr + " {} not supported, should be bool type".format(param)
+            )
+
+    @staticmethod
+    def check_open_unit_interval(param, descr):
+        if type(param).__name__ not in ["float"] or param <= 0 or param >= 1:
+            raise ValueError(
+                descr + " should be a numeric number between 0 and 1 exclusively"
+            )
+
+    @staticmethod
+    def check_valid_value(param, descr, valid_values):
+        if param not in valid_values:
+            raise ValueError(
+                descr
+                + " {} is not supported, it should be in {}".format(param, valid_values)
+            )
+
+    @staticmethod
+    def check_defined_type(param, descr, types):
+        if type(param).__name__ not in types:
+            raise ValueError(
+                descr + " {} not supported, should be one of {}".format(param, types)
+            )
+
+    @staticmethod
+    def check_and_change_lower(param, valid_list, descr=""):
+        if type(param).__name__ != "str":
+            raise ValueError(
+                descr
+                + " {} not supported, should be one of {}".format(param, valid_list)
+            )
+
+        lower_param = param.lower()
+        if lower_param in valid_list:
+            return lower_param
+        else:
+            raise ValueError(
+                descr
+                + " {} not supported, should be one of {}".format(param, valid_list)
+            )
+
+    @staticmethod
+    def _greater_equal_than(value, limit):
+        return value >= limit - settings.FLOAT_ZERO
+
+    @staticmethod
+    def _less_equal_than(value, limit):
+        return value <= limit + settings.FLOAT_ZERO
+
+    @staticmethod
+    def _range(value, ranges):
+        in_range = False
+        for left_limit, right_limit in ranges:
+            if (
+                    left_limit - settings.FLOAT_ZERO
+                    <= value
+                    <= right_limit + settings.FLOAT_ZERO
+            ):
+                in_range = True
+                break
+
+        return in_range
+
+    @staticmethod
+    def _in(value, right_value_list):
+        return value in right_value_list
+
+    @staticmethod
+    def _not_in(value, wrong_value_list):
+        return value not in wrong_value_list
+
+    def _warn_deprecated_param(self, param_name, descr):
+        if self._deprecated_params_set.get(param_name):
+            logging.warning(
+                f"{descr} {param_name} is deprecated and ignored in this version."
+            )
+
+    def _warn_to_deprecate_param(self, param_name, descr, new_param):
+        if self._deprecated_params_set.get(param_name):
+            logging.warning(
+                f"{descr} {param_name} will be deprecated in future release; "
+                f"please use {new_param} instead."
+            )
+            return True
+        return False
+
+
+class ComponentBase(ABC):
+    component_name: str
+
+    def __str__(self):
+        """
+        {
+            "component_name": "Begin",
+            "params": {}
+        }
+        """
+        return """{{
+            "component_name": "{}",
+            "params": {},
+            "output": {},
+            "inputs": {}
+        }}""".format(self.component_name,
+                     self._param,
+                     json.dumps(json.loads(str(self._param)).get("output", {}), ensure_ascii=False),
+                     json.dumps(json.loads(str(self._param)).get("inputs", []), ensure_ascii=False)
+        )
+
+    def __init__(self, canvas, id, param: ComponentParamBase):
+        self._canvas = canvas
+        self._id = id
+        self._param = param
+        self._param.check()
+
+    def get_dependent_components(self):
+        cpnts = {para["component_id"].split("@")[0] for para in self._param.query
+                 if para.get("component_id")
+                 and para["component_id"].lower().find("answer") < 0
+                 and para["component_id"].lower().find("begin") < 0}
+        return list(cpnts)
+
+    def run(self, history, **kwargs):
+        logging.debug("{}, history: {}, kwargs: {}".format(self, json.dumps(history, ensure_ascii=False),
+                                                              json.dumps(kwargs, ensure_ascii=False)))
+        self._param.debug_inputs = []
+        try:
+            res = self._run(history, **kwargs)
+            self.set_output(res)
+        except Exception as e:
+            self.set_output(pd.DataFrame([{"content": str(e)}]))
+            raise
+
+        return res
+
+    def _run(self, history, **kwargs):
+        raise NotImplementedError()
+
+    def output(self, allow_partial=True) -> Tuple[str, Union[pd.DataFrame, partial]]:
+        o = getattr(self._param, self._param.output_var_name)
+        if not isinstance(o, partial):
+            if not isinstance(o, pd.DataFrame):
+                if isinstance(o, list):
+                    return self._param.output_var_name, pd.DataFrame(o)
+                if o is None:
+                    return self._param.output_var_name, pd.DataFrame()
+                return self._param.output_var_name, pd.DataFrame([{"content": str(o)}])
+            return self._param.output_var_name, o
+
+        if allow_partial or not isinstance(o, partial):
+            if not isinstance(o, partial) and not isinstance(o, pd.DataFrame):
+                return self._param.output_var_name, pd.DataFrame(o if isinstance(o, list) else [o])
+            return self._param.output_var_name, o
+
+        outs = None
+        for oo in o():
+            if not isinstance(oo, pd.DataFrame):
+                outs = pd.DataFrame(oo if isinstance(oo, list) else [oo])
+            else:
+                outs = oo
+        return self._param.output_var_name, outs
+
+    def reset(self):
+        setattr(self._param, self._param.output_var_name, None)
+        self._param.inputs = []
+
+    def set_output(self, v):
+        setattr(self._param, self._param.output_var_name, v)
+
+    def get_input(self):
+        if self._param.debug_inputs:
+            return pd.DataFrame([{"content": v["value"]} for v in self._param.debug_inputs if v.get("value")])
+
+        reversed_cpnts = []
+        if len(self._canvas.path) > 1:
+            reversed_cpnts.extend(self._canvas.path[-2])
+        reversed_cpnts.extend(self._canvas.path[-1])
+
+        if self._param.query:
+            self._param.inputs = []
+            outs = []
+            for q in self._param.query:
+                if q.get("component_id"):
+                    if q["component_id"].split("@")[0].lower().find("begin") >= 0:
+                        cpn_id, key = q["component_id"].split("@")
+                        for p in self._canvas.get_component(cpn_id)["obj"]._param.query:
+                            if p["key"] == key:
+                                outs.append(pd.DataFrame([{"content": p.get("value", "")}]))
+                                self._param.inputs.append({"component_id": q["component_id"],
+                                                           "content": p.get("value", "")})
+                                break
+                        else:
+                            raise AssertionError(f"Can't find parameter '{key}' for {cpn_id}")
+                        continue
+
+                    if q["component_id"].lower().find("answer") == 0:
+                        txt = []
+                        for r, c in self._canvas.history[::-1][:self._param.message_history_window_size][::-1]:
+                            txt.append(f"{r.upper()}: {c}")
+                        txt = "\n".join(txt)
+                        self._param.inputs.append({"content": txt, "component_id": q["component_id"]})
+                        outs.append(pd.DataFrame([{"content": txt}]))
+                        continue
+
+                    outs.append(self._canvas.get_component(q["component_id"])["obj"].output(allow_partial=False)[1])
+                    self._param.inputs.append({"component_id": q["component_id"],
+                                               "content": "\n".join(
+                                                   [str(d["content"]) for d in outs[-1].to_dict('records')])})
+                elif q.get("value"):
+                    self._param.inputs.append({"component_id": None, "content": q["value"]})
+                    outs.append(pd.DataFrame([{"content": q["value"]}]))
+            if outs:
+                df = pd.concat(outs, ignore_index=True)
+                if "content" in df:
+                    df = df.drop_duplicates(subset=['content']).reset_index(drop=True)
+                return df
+
+        upstream_outs = []
+
+        for u in reversed_cpnts[::-1]:
+            if self.get_component_name(u) in ["switch", "concentrator"]:
+                continue
+            if self.component_name.lower() == "generate" and self.get_component_name(u) == "retrieval":
+                o = self._canvas.get_component(u)["obj"].output(allow_partial=False)[1]
+                if o is not None:
+                    o["component_id"] = u
+                    upstream_outs.append(o)
+                    continue
+            #if self.component_name.lower()!="answer" and u not in self._canvas.get_component(self._id)["upstream"]: continue
+            if self.component_name.lower().find("switch") < 0 \
+                    and self.get_component_name(u) in ["relevant", "categorize"]:
+                continue
+            if u.lower().find("answer") >= 0:
+                for r, c in self._canvas.history[::-1]:
+                    if r == "user":
+                        upstream_outs.append(pd.DataFrame([{"content": c, "component_id": u}]))
+                        break
+                break
+            if self.component_name.lower().find("answer") >= 0 and self.get_component_name(u) in ["relevant"]:
+                continue
+            o = self._canvas.get_component(u)["obj"].output(allow_partial=False)[1]
+            if o is not None:
+                o["component_id"] = u
+                upstream_outs.append(o)
+            break
+
+        assert upstream_outs, "Can't infer where the component input comes from. Please specify whose output is this component's input."
+
+        df = pd.concat(upstream_outs, ignore_index=True)
+        if "content" in df:
+            df = df.drop_duplicates(subset=['content']).reset_index(drop=True)
+
+        self._param.inputs = []
+        for _, r in df.iterrows():
+            self._param.inputs.append({"component_id": r["component_id"], "content": r["content"]})
+
+        return df
+
+    def get_input_elements(self):
+        assert self._param.query, "Please specify the input parameters first."
+        eles = []
+        for q in self._param.query:
+            if q.get("component_id"):
+                cpn_id = q["component_id"]
+                if cpn_id.split("@")[0].lower().find("begin") >= 0:
+                    cpn_id, key = cpn_id.split("@")
+                    eles.extend(self._canvas.get_component(cpn_id)["obj"]._param.query)
+                    continue
+
+                eles.append({"name": self._canvas.get_component_name(cpn_id), "key": cpn_id})
+            else:
+                eles.append({"key": q["value"], "name": q["value"], "value": q["value"]})
+        return eles
+
+    def get_stream_input(self):
+        reversed_cpnts = []
+        if len(self._canvas.path) > 1:
+            reversed_cpnts.extend(self._canvas.path[-2])
+        reversed_cpnts.extend(self._canvas.path[-1])
+
+        for u in reversed_cpnts[::-1]:
+            if self.get_component_name(u) in ["switch", "answer"]:
+                continue
+            return self._canvas.get_component(u)["obj"].output()[1]
+
+    @staticmethod
+    def be_output(v):
+        return pd.DataFrame([{"content": v}])
+
+    def get_component_name(self, cpn_id):
+        return self._canvas.get_component(cpn_id)["obj"].component_name.lower()
+
+    def debug(self, **kwargs):
+        return self._run([], **kwargs)
+
+    def get_parent(self):
+        pid = self._canvas.get_component(self._id)["parent_id"]
+        return self._canvas.get_component(pid)["obj"]
--- a/agent/component/begin.py
+++ b/agent/component/begin.py
@ -0,0 +1,49 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from functools import partial
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class BeginParam(ComponentParamBase):
+
+    """
+    Define the Begin component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.prologue = "Hi! I'm your smart assistant. What can I do for you?"
+        self.query = []
+
+    def check(self):
+        return True
+
+
+class Begin(ComponentBase):
+    component_name = "Begin"
+
+    def _run(self, history, **kwargs):
+        if kwargs.get("stream"):
+            return partial(self.stream_output)
+        return pd.DataFrame([{"content": self._param.prologue}])
+
+    def stream_output(self):
+        res = {"content": self._param.prologue}
+        yield res
+        self.set_output(self.be_output(res["content"]))
+
+
+
--- a/agent/component/bing.py
+++ b/agent/component/bing.py
@ -0,0 +1,84 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import requests
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+class BingParam(ComponentParamBase):
+    """
+    Define the Bing component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+        self.channel = "Webpages"
+        self.api_key = "YOUR_ACCESS_KEY"
+        self.country = "CN"
+        self.language = "en"
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_valid_value(self.channel, "Bing Web Search or Bing News", ["Webpages", "News"])
+        self.check_empty(self.api_key, "Bing subscription key")
+        self.check_valid_value(self.country, "Bing Country",
+                               ['AR', 'AU', 'AT', 'BE', 'BR', 'CA', 'CL', 'DK', 'FI', 'FR', 'DE', 'HK', 'IN', 'ID',
+                                'IT', 'JP', 'KR', 'MY', 'MX', 'NL', 'NZ', 'NO', 'CN', 'PL', 'PT', 'PH', 'RU', 'SA',
+                                'ZA', 'ES', 'SE', 'CH', 'TW', 'TR', 'GB', 'US'])
+        self.check_valid_value(self.language, "Bing Languages",
+                               ['ar', 'eu', 'bn', 'bg', 'ca', 'ns', 'nt', 'hr', 'cs', 'da', 'nl', 'en', 'gb', 'et',
+                                'fi', 'fr', 'gl', 'de', 'gu', 'he', 'hi', 'hu', 'is', 'it', 'jp', 'kn', 'ko', 'lv',
+                                'lt', 'ms', 'ml', 'mr', 'nb', 'pl', 'br', 'pt', 'pa', 'ro', 'ru', 'sr', 'sk', 'sl',
+                                'es', 'sv', 'ta', 'te', 'th', 'tr', 'uk', 'vi'])
+
+
+class Bing(ComponentBase, ABC):
+    component_name = "Bing"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return Bing.be_output("")
+
+        try:
+            headers = {"Ocp-Apim-Subscription-Key": self._param.api_key, 'Accept-Language': self._param.language}
+            params = {"q": ans, "textDecorations": True, "textFormat": "HTML", "cc": self._param.country,
+                      "answerCount": 1, "promote": self._param.channel}
+            if self._param.channel == "Webpages":
+                response = requests.get("https://api.bing.microsoft.com/v7.0/search", headers=headers, params=params)
+                response.raise_for_status()
+                search_results = response.json()
+                bing_res = [{"content": '<a href="' + i["url"] + '">' + i["name"] + '</a>    ' + i["snippet"]} for i in
+                            search_results["webPages"]["value"]]
+            elif self._param.channel == "News":
+                response = requests.get("https://api.bing.microsoft.com/v7.0/news/search", headers=headers,
+                                        params=params)
+                response.raise_for_status()
+                search_results = response.json()
+                bing_res = [{"content": '<a href="' + i["url"] + '">' + i["name"] + '</a>    ' + i["description"]} for i
+                            in search_results['news']['value']]
+        except Exception as e:
+            return Bing.be_output("**ERROR**: " + str(e))
+
+        if not bing_res:
+            return Bing.be_output("")
+
+        df = pd.DataFrame(bing_res)
+        logging.debug(f"df: {str(df)}")
+        return df
--- a/agent/component/categorize.py
+++ b/agent/component/categorize.py
@ -0,0 +1,98 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+from api.db import LLMType
+from api.db.services.llm_service import LLMBundle
+from agent.component import GenerateParam, Generate
+
+
+class CategorizeParam(GenerateParam):
+
+    """
+    Define the Categorize component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.category_description = {}
+        self.prompt = ""
+
+    def check(self):
+        super().check()
+        self.check_empty(self.category_description, "[Categorize] Category examples")
+        for k, v in self.category_description.items():
+            if not k:
+                raise ValueError("[Categorize] Category name can not be empty!")
+            if not v.get("to"):
+                raise ValueError(f"[Categorize] 'To' of category {k} can not be empty!")
+
+    def get_prompt(self, chat_hist):
+        cate_lines = []
+        for c, desc in self.category_description.items():
+            for line in desc.get("examples", "").split("\n"):
+                if not line:
+                    continue
+                cate_lines.append("USER: {}\nCategory: {}".format(line, c))
+        descriptions = []
+        for c, desc in self.category_description.items():
+            if desc.get("description"):
+                descriptions.append(
+                    "--------------------\nCategory: {}\nDescription: {}\n".format(c, desc["description"]))
+
+        self.prompt = """
+        You're a text classifier. You need to categorize the user’s questions into {} categories, 
+        namely: {}
+        Here's the description of each category:
+        {}
+
+        You could learn from the following examples:
+        {}
+        You could learn from the above examples.
+        Just mention the category names, no need for any additional words.
+        
+        ---- Real Data ----
+        {}
+        """.format(
+            len(self.category_description.keys()),
+            "/".join(list(self.category_description.keys())),
+            "\n".join(descriptions),
+            "- ".join(cate_lines),
+            chat_hist
+        )
+        return self.prompt
+
+
+class Categorize(Generate, ABC):
+    component_name = "Categorize"
+
+    def _run(self, history, **kwargs):
+        query = self.get_input()
+        query = " - ".join(query["content"]) if "content" in query else ""
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        ans = chat_mdl.chat(self._param.get_prompt(query), [{"role": "user", "content": "\nCategory: "}],
+                            self._param.gen_conf())
+        logging.debug(f"input: {query}, answer: {str(ans)}")
+        for c in self._param.category_description.keys():
+            if ans.lower().find(c.lower()) >= 0:
+                return Categorize.be_output(self._param.category_description[c]["to"])
+
+        return Categorize.be_output(list(self._param.category_description.items())[-1][1]["to"])
+
+    def debug(self, **kwargs):
+        df = self._run([], **kwargs)
+        cpn_id = df.iloc[0, 0]
+        return Categorize.be_output(self._canvas.get_component_name(cpn_id))
+
--- a/agent/component/concentrator.py
+++ b/agent/component/concentrator.py
@ -0,0 +1,36 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class ConcentratorParam(ComponentParamBase):
+    """
+    Define the Concentrator component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+
+    def check(self):
+        return True
+
+
+class Concentrator(ComponentBase, ABC):
+    component_name = "Concentrator"
+
+    def _run(self, history, **kwargs):
+        return Concentrator.be_output("")
--- a/agent/component/crawler.py
+++ b/agent/component/crawler.py
@ -0,0 +1,67 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import asyncio
+from crawl4ai import AsyncWebCrawler
+from agent.component.base import ComponentBase, ComponentParamBase
+from api.utils.web_utils import is_valid_url
+
+
+class CrawlerParam(ComponentParamBase):
+    """
+    Define the Crawler component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.proxy = None
+        self.extract_type = "markdown"
+    
+    def check(self):
+        self.check_valid_value(self.extract_type, "Type of content from the crawler", ['html', 'markdown', 'content'])
+
+
+class Crawler(ComponentBase, ABC):
+    component_name = "Crawler"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not is_valid_url(ans):
+            return Crawler.be_output("URL not valid")
+        try:
+            result = asyncio.run(self.get_web(ans))
+
+            return Crawler.be_output(result)
+            
+        except Exception as e:
+            return Crawler.be_output(f"An unexpected error occurred: {str(e)}")
+
+    async def get_web(self, url):
+        proxy = self._param.proxy if self._param.proxy else None
+        async with AsyncWebCrawler(verbose=True, proxy=proxy) as crawler:
+            result = await crawler.arun(
+                url=url,
+                bypass_cache=True
+            )
+            
+            if self._param.extract_type == 'html':
+                return result.cleaned_html
+            elif self._param.extract_type == 'markdown':
+                return result.markdown
+            elif self._param.extract_type == 'content':
+                return result.extracted_content
+            return result.markdown
--- a/agent/component/deepl.py
+++ b/agent/component/deepl.py
@ -0,0 +1,61 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+import deepl
+
+
+class DeepLParam(ComponentParamBase):
+    """
+    Define the DeepL component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.auth_key = "xxx"
+        self.parameters = []
+        self.source_lang = 'ZH'
+        self.target_lang = 'EN-GB'
+
+    def check(self):
+        self.check_empty(self.auth_key, "DeepL auth key")
+        self.check_valid_value(self.source_lang, "Source language",
+                               ['AR', 'BG', 'CS', 'DA', 'DE', 'EL', 'EN', 'ES', 'ET', 'FI', 'FR', 'HU', 'ID', 'IT',
+                                'JA', 'KO', 'LT', 'LV', 'NB', 'NL', 'PL', 'PT', 'RO', 'RU', 'SK', 'SL', 'SV', 'TR',
+                                'UK', 'ZH'])
+        self.check_valid_value(self.target_lang, "Target language",
+                               ['AR', 'BG', 'CS', 'DA', 'DE', 'EL', 'EN-GB', 'EN-US', 'ES', 'ET', 'FI', 'FR', 'HU',
+                                'ID', 'IT', 'JA', 'KO', 'LT', 'LV', 'NB', 'NL', 'PL', 'PT-BR', 'PT-PT', 'RO', 'RU',
+                                'SK', 'SL', 'SV', 'TR', 'UK', 'ZH'])
+
+
+class DeepL(ComponentBase, ABC):
+    component_name = "DeepL"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return DeepL.be_output("")
+
+        try:
+            translator = deepl.Translator(self._param.auth_key)
+            result = translator.translate_text(ans, source_lang=self._param.source_lang,
+                                               target_lang=self._param.target_lang)
+
+            return DeepL.be_output(result.text)
+        except Exception as e:
+            return DeepL.be_output("**Error**:" + str(e))
--- a/agent/component/duckduckgo.py
+++ b/agent/component/duckduckgo.py
@ -0,0 +1,66 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+from duckduckgo_search import DDGS
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class DuckDuckGoParam(ComponentParamBase):
+    """
+    Define the DuckDuckGo component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+        self.channel = "text"
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_valid_value(self.channel, "Web Search or News", ["text", "news"])
+
+
+class DuckDuckGo(ComponentBase, ABC):
+    component_name = "DuckDuckGo"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return DuckDuckGo.be_output("")
+
+        try:
+            if self._param.channel == "text":
+                with DDGS() as ddgs:
+                    # {'title': '', 'href': '', 'body': ''}
+                    duck_res = [{"content": '<a href="' + i["href"] + '">' + i["title"] + '</a>    ' + i["body"]} for i
+                                in ddgs.text(ans, max_results=self._param.top_n)]
+            elif self._param.channel == "news":
+                with DDGS() as ddgs:
+                    # {'date': '', 'title': '', 'body': '', 'url': '', 'image': '', 'source': ''}
+                    duck_res = [{"content": '<a href="' + i["url"] + '">' + i["title"] + '</a>    ' + i["body"]} for i
+                                in ddgs.news(ans, max_results=self._param.top_n)]
+        except Exception as e:
+            return DuckDuckGo.be_output("**ERROR**: " + str(e))
+
+        if not duck_res:
+            return DuckDuckGo.be_output("")
+
+        df = pd.DataFrame(duck_res)
+        logging.debug(f"df: {df}")
+        return df
--- a/agent/component/email.py
+++ b/agent/component/email.py
@ -0,0 +1,138 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from abc import ABC
+import json
+import smtplib
+import logging
+from email.mime.text import MIMEText
+from email.mime.multipart import MIMEMultipart
+from email.header import Header
+from email.utils import formataddr
+from agent.component.base import ComponentBase, ComponentParamBase
+
+class EmailParam(ComponentParamBase):
+    """
+    Define the Email component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        # Fixed configuration parameters
+        self.smtp_server = ""  # SMTP server address
+        self.smtp_port = 465  # SMTP port
+        self.email = ""  # Sender email
+        self.password = ""  # Email authorization code
+        self.sender_name = ""  # Sender name
+
+    def check(self):
+        # Check required parameters
+        self.check_empty(self.smtp_server, "SMTP Server")
+        self.check_empty(self.email, "Email")
+        self.check_empty(self.password, "Password")
+        self.check_empty(self.sender_name, "Sender Name")
+
+class Email(ComponentBase, ABC):
+    component_name = "Email"
+    
+    def _run(self, history, **kwargs):
+        # Get upstream component output and parse JSON
+        ans = self.get_input()
+        content = "".join(ans["content"]) if "content" in ans else ""
+        if not content:
+            return Email.be_output("No content to send")
+
+        success = False
+        try:
+            # Parse JSON string passed from upstream
+            email_data = json.loads(content)
+            
+            # Validate required fields
+            if "to_email" not in email_data:
+                return Email.be_output("Missing required field: to_email")
+
+            # Create email object
+            msg = MIMEMultipart('alternative')
+            
+            # Properly handle sender name encoding
+            msg['From'] = formataddr((str(Header(self._param.sender_name,'utf-8')), self._param.email))
+            msg['To'] = email_data["to_email"]
+            if "cc_email" in email_data and email_data["cc_email"]:
+                msg['Cc'] = email_data["cc_email"]
+            msg['Subject'] = Header(email_data.get("subject", "No Subject"), 'utf-8').encode()
+
+            # Use content from email_data or default content
+            email_content = email_data.get("content", "No content provided")
+            # msg.attach(MIMEText(email_content, 'plain', 'utf-8'))
+            msg.attach(MIMEText(email_content, 'html', 'utf-8'))
+
+            # Connect to SMTP server and send
+            logging.info(f"Connecting to SMTP server {self._param.smtp_server}:{self._param.smtp_port}")
+            
+            context = smtplib.ssl.create_default_context()
+            with smtplib.SMTP_SSL(self._param.smtp_server, self._param.smtp_port, context=context) as server:
+                # Login
+                logging.info(f"Attempting to login with email: {self._param.email}")
+                server.login(self._param.email, self._param.password)
+                
+                # Get all recipient list
+                recipients = [email_data["to_email"]]
+                if "cc_email" in email_data and email_data["cc_email"]:
+                    recipients.extend(addr.strip() for addr in email_data["cc_email"].split(',') if addr.strip())
+                
+                # Send email
+                logging.info(f"Sending email to recipients: {recipients}")
+                try:
+                    server.send_message(msg, self._param.email, recipients)
+                    success = True
+                except Exception as e:
+                    logging.error(f"Error during send_message: {str(e)}")
+                    # Try alternative method
+                    server.sendmail(self._param.email, recipients, msg.as_string())
+                    success = True
+                
+                try:
+                    server.quit()
+                except Exception as e:
+                    # Ignore errors when closing connection
+                    logging.warning(f"Non-fatal error during connection close: {str(e)}")
+
+            if success:
+                return Email.be_output("Email sent successfully")
+
+        except json.JSONDecodeError:
+            error_msg = "Invalid JSON format in input"
+            logging.error(error_msg)
+            return Email.be_output(error_msg)
+            
+        except smtplib.SMTPAuthenticationError:
+            error_msg = "SMTP Authentication failed. Please check your email and authorization code."
+            logging.error(error_msg)
+            return Email.be_output(f"Failed to send email: {error_msg}")
+            
+        except smtplib.SMTPConnectError:
+            error_msg = f"Failed to connect to SMTP server {self._param.smtp_server}:{self._param.smtp_port}"
+            logging.error(error_msg)
+            return Email.be_output(f"Failed to send email: {error_msg}")
+            
+        except smtplib.SMTPException as e:
+            error_msg = f"SMTP error occurred: {str(e)}"
+            logging.error(error_msg)
+            return Email.be_output(f"Failed to send email: {error_msg}")
+            
+        except Exception as e:
+            error_msg = f"Unexpected error: {str(e)}"
+            logging.error(error_msg)
+            return Email.be_output(f"Failed to send email: {error_msg}") 
--- a/agent/component/exesql.py
+++ b/agent/component/exesql.py
@ -0,0 +1,154 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import re
+from copy import deepcopy
+
+import pandas as pd
+import pymysql
+import psycopg2
+from agent.component import GenerateParam, Generate
+import pyodbc
+import logging
+
+
+class ExeSQLParam(GenerateParam):
+    """
+    Define the ExeSQL component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.db_type = "mysql"
+        self.database = ""
+        self.username = ""
+        self.host = ""
+        self.port = 3306
+        self.password = ""
+        self.loop = 3
+        self.top_n = 30
+
+    def check(self):
+        super().check()
+        self.check_valid_value(self.db_type, "Choose DB type", ['mysql', 'postgresql', 'mariadb', 'mssql'])
+        self.check_empty(self.database, "Database name")
+        self.check_empty(self.username, "Database username")
+        self.check_empty(self.host, "IP Address")
+        self.check_positive_integer(self.port, "IP Port")
+        self.check_empty(self.password, "Database password")
+        self.check_positive_integer(self.top_n, "Number of records")
+        if self.database == "rag_flow":
+            if self.host == "ragflow-mysql":
+                raise ValueError("For security reasons, the database named rag_flow is not supported.")
+            if self.password == "infini_rag_flow":
+                raise ValueError("For security reasons, the database named rag_flow is not supported.")
+
+
+class ExeSQL(Generate, ABC):
+    component_name = "ExeSQL"
+
+    def _refactor(self, ans):
+        ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
+        match = re.search(r"```sql\s*(.*?)\s*```", ans, re.DOTALL)
+        if match:
+            ans = match.group(1)  # Query content
+            return ans
+        else:
+            logging.debug("No fenced sql block found in the answer; falling back to regex cleanup.")
+        ans = re.sub(r'^.*?SELECT ', 'SELECT ', (ans), flags=re.IGNORECASE)
+        ans = re.sub(r';.*?SELECT ', '; SELECT ', ans, flags=re.IGNORECASE)
+        ans = re.sub(r';[^;]*$', r';', ans)
+        if not ans:
+            raise Exception("SQL statement not found!")
+        return ans
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = "".join([str(a) for a in ans["content"]]) if "content" in ans else ""
+        ans = self._refactor(ans)
+        if self._param.db_type in ["mysql", "mariadb"]:
+            db = pymysql.connect(db=self._param.database, user=self._param.username, host=self._param.host,
+                                 port=self._param.port, password=self._param.password)
+        elif self._param.db_type == 'postgresql':
+            db = psycopg2.connect(dbname=self._param.database, user=self._param.username, host=self._param.host,
+                                  port=self._param.port, password=self._param.password)
+        elif self._param.db_type == 'mssql':
+            conn_str = (
+                    r'DRIVER={ODBC Driver 17 for SQL Server};'
+                    r'SERVER=' + self._param.host + ',' + str(self._param.port) + ';'
+                    r'DATABASE=' + self._param.database + ';'
+                    r'UID=' + self._param.username + ';'
+                    r'PWD=' + self._param.password
+            )
+            db = pyodbc.connect(conn_str)
+        try:
+            cursor = db.cursor()
+        except Exception as e:
+            raise Exception("Database Connection Failed! \n" + str(e))
+        if not hasattr(self, "_loop"):
+            setattr(self, "_loop", 0)
+            self._loop += 1
+        input_list = re.split(r';', ans.replace(r"\n", " "))
+        sql_res = []
+        for i in range(len(input_list)):
+            single_sql = input_list[i]
+            while self._loop <= self._param.loop:
+                self._loop += 1
+                if not single_sql:
+                    break
+                try:
+                    cursor.execute(single_sql)
+                    if cursor.rowcount == 0:
+                        sql_res.append({"content": "No record in the database!"})
+                        break
+                    if self._param.db_type == 'mssql':
+                        single_res = pd.DataFrame.from_records(cursor.fetchmany(self._param.top_n),
+                                                               columns=[desc[0] for desc in cursor.description])
+                    else:
+                        single_res = pd.DataFrame([i for i in cursor.fetchmany(self._param.top_n)])
+                        single_res.columns = [i[0] for i in cursor.description]
+                    sql_res.append({"content": single_res.to_markdown(index=False, floatfmt=".6f")})
+                    break
+                except Exception as e:
+                    single_sql = self._regenerate_sql(single_sql, str(e), **kwargs)
+                    single_sql = self._refactor(single_sql)
+                    if self._loop > self._param.loop:
+                        sql_res.append({"content": "Can't query the correct data via SQL statement."})
+        db.close()
+        if not sql_res:
+            return ExeSQL.be_output("")
+        return pd.DataFrame(sql_res)
+
+    def _regenerate_sql(self, failed_sql, error_message, **kwargs):
+        prompt = f'''
+        ## You are the Repair SQL Statement Helper, please modify the original SQL statement based on the SQL query error report.
+        ## The original SQL statement is as follows:{failed_sql}.
+        ## The contents of the SQL query error report is as follows:{error_message}.
+        ## Answer only the modified SQL statement. Please do not give any explanation, just answer the code.
+'''
+        self._param.prompt = prompt
+        kwargs_ = deepcopy(kwargs)
+        kwargs_["stream"] = False
+        response = Generate._run(self, [], **kwargs_)
+        try:
+            regenerated_sql = response.loc[0, "content"]
+            return regenerated_sql
+        except Exception as e:
+            logging.error(f"Failed to regenerate SQL: {e}")
+            return failed_sql  # fall back to the original statement so _refactor() never receives None
+
+    def debug(self, **kwargs):
+        return self._run([], **kwargs)
--- a/agent/component/generate.py
+++ b/agent/component/generate.py
@ -0,0 +1,248 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import re
+from functools import partial
+import pandas as pd
+from api.db import LLMType
+from api.db.services.conversation_service import structure_answer
+from api.db.services.llm_service import LLMBundle
+from api import settings
+from agent.component.base import ComponentBase, ComponentParamBase
+from rag.prompts import message_fit_in
+
+
+class GenerateParam(ComponentParamBase):
+    """
+    Define the Generate component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.llm_id = ""
+        self.prompt = ""
+        self.max_tokens = 0
+        self.temperature = 0
+        self.top_p = 0
+        self.presence_penalty = 0
+        self.frequency_penalty = 0
+        self.cite = True
+        self.parameters = []
+
+    def check(self):
+        self.check_decimal_float(self.temperature, "[Generate] Temperature")
+        self.check_decimal_float(self.presence_penalty, "[Generate] Presence penalty")
+        self.check_decimal_float(self.frequency_penalty, "[Generate] Frequency penalty")
+        self.check_nonnegative_number(self.max_tokens, "[Generate] Max tokens")
+        self.check_decimal_float(self.top_p, "[Generate] Top P")
+        self.check_empty(self.llm_id, "[Generate] LLM")
+        # self.check_defined_type(self.parameters, "Parameters", ["list"])
+
+    def gen_conf(self):
+        conf = {}
+        if self.max_tokens > 0:
+            conf["max_tokens"] = self.max_tokens
+        if self.temperature > 0:
+            conf["temperature"] = self.temperature
+        if self.top_p > 0:
+            conf["top_p"] = self.top_p
+        if self.presence_penalty > 0:
+            conf["presence_penalty"] = self.presence_penalty
+        if self.frequency_penalty > 0:
+            conf["frequency_penalty"] = self.frequency_penalty
+        return conf
+
+
+class Generate(ComponentBase):
+    component_name = "Generate"
+
+    def get_dependent_components(self):
+        inputs = self.get_input_elements()
+        cpnts = set([i["key"] for i in inputs[1:] if i["key"].lower().find("answer") < 0 and i["key"].lower().find("begin") < 0])
+        return list(cpnts)
+
+    def set_cite(self, retrieval_res, answer):
+        retrieval_res = retrieval_res.dropna(subset=["vector", "content_ltks"]).reset_index(drop=True)
+        if "empty_response" in retrieval_res.columns:
+            retrieval_res["empty_response"] = retrieval_res["empty_response"].fillna("")
+        answer, idx = settings.retrievaler.insert_citations(answer,
+                                                            [ck["content_ltks"] for _, ck in retrieval_res.iterrows()],
+                                                            [ck["vector"] for _, ck in retrieval_res.iterrows()],
+                                                            LLMBundle(self._canvas.get_tenant_id(), LLMType.EMBEDDING,
+                                                                      self._canvas.get_embedding_model()), tkweight=0.7,
+                                                            vtweight=0.3)
+        doc_ids = set([])
+        recall_docs = []
+        for i in idx:
+            did = retrieval_res.loc[int(i), "doc_id"]
+            if did in doc_ids:
+                continue
+            doc_ids.add(did)
+            recall_docs.append({"doc_id": did, "doc_name": retrieval_res.loc[int(i), "docnm_kwd"]})
+
+        del retrieval_res["vector"]
+        del retrieval_res["content_ltks"]
+
+        reference = {
+            "chunks": [ck.to_dict() for _, ck in retrieval_res.iterrows()],
+            "doc_aggs": recall_docs
+        }
+
+        if answer.lower().find("invalid key") >= 0 or answer.lower().find("invalid api") >= 0:
+            answer += " Please set LLM API-Key in 'User Setting -> Model providers -> API-Key'"
+        res = {"content": answer, "reference": reference}
+        res = structure_answer(None, res, "", "")
+
+        return res
+
+    def get_input_elements(self):
+        key_set = set([])
+        res = [{"key": "user", "name": "Input your question here:"}]
+        for r in re.finditer(r"\{([a-z]+[:@][a-z0-9_-]+)\}", self._param.prompt, flags=re.IGNORECASE):
+            cpn_id = r.group(1)
+            if cpn_id in key_set:
+                continue
+            if cpn_id.lower().find("begin@") == 0:
+                cpn_id, key = cpn_id.split("@")
+                for p in self._canvas.get_component(cpn_id)["obj"]._param.query:
+                    if p["key"] != key:
+                        continue
+                    res.append({"key": r.group(1), "name": p["name"]})
+                    key_set.add(r.group(1))
+                continue
+            cpn_nm = self._canvas.get_component_name(cpn_id)
+            if not cpn_nm:
+                continue
+            res.append({"key": cpn_id, "name": cpn_nm})
+            key_set.add(cpn_id)
+        return res
+
+    def _run(self, history, **kwargs):
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        prompt = self._param.prompt
+
+        retrieval_res = []
+        self._param.inputs = []
+        for para in self.get_input_elements()[1:]:
+            if para["key"].lower().find("begin@") == 0:
+                cpn_id, key = para["key"].split("@")
+                for p in self._canvas.get_component(cpn_id)["obj"]._param.query:
+                    if p["key"] == key:
+                        kwargs[para["key"]] = p.get("value", "")
+                        self._param.inputs.append(
+                            {"component_id": para["key"], "content": kwargs[para["key"]]})
+                        break
+                else:
+                    assert False, f"Can't find parameter '{key}' for {cpn_id}"
+                continue
+
+            component_id = para["key"]
+            cpn = self._canvas.get_component(component_id)["obj"]
+            if cpn.component_name.lower() == "answer":
+                hist = self._canvas.get_history(1)
+                if hist:
+                    hist = hist[0]["content"]
+                else:
+                    hist = ""
+                kwargs[para["key"]] = hist
+                continue
+            _, out = cpn.output(allow_partial=False)
+            if "content" not in out.columns:
+                kwargs[para["key"]] = ""
+            else:
+                if cpn.component_name.lower() == "retrieval":
+                    retrieval_res.append(out)
+                kwargs[para["key"]] = "  - " + "\n - ".join([o if isinstance(o, str) else str(o) for o in out["content"]])
+            self._param.inputs.append({"component_id": para["key"], "content": kwargs[para["key"]]})
+
+        if retrieval_res:
+            retrieval_res = pd.concat(retrieval_res, ignore_index=True)
+        else:
+            retrieval_res = pd.DataFrame([])
+
+        for n, v in kwargs.items():
+            prompt = re.sub(r"\{%s\}" % re.escape(n), str(v).replace("\\", " "), prompt)
+
+        if not self._param.inputs and prompt.find("{input}") >= 0:
+            retrieval_res = self.get_input()
+            input = ("  - " + "\n  - ".join(
+                [c for c in retrieval_res["content"] if isinstance(c, str)])) if "content" in retrieval_res else ""
+            prompt = re.sub(r"\{input\}", lambda _: input, prompt)  # insert the text literally, no escape processing
+
+        downstreams = self._canvas.get_component(self._id)["downstream"]
+        if kwargs.get("stream") and len(downstreams) == 1 and self._canvas.get_component(downstreams[0])[
+            "obj"].component_name.lower() == "answer":
+            return partial(self.stream_output, chat_mdl, prompt, retrieval_res)
+
+        if "empty_response" in retrieval_res.columns and not "".join(retrieval_res["content"]):
+            empty_res = "\n- ".join([str(t) for t in retrieval_res["empty_response"] if str(t)])
+            res = {"content": empty_res if empty_res else "Nothing found in knowledgebase!", "reference": []}
+            return pd.DataFrame([res])
+
+        msg = self._canvas.get_history(self._param.message_history_window_size)
+        if len(msg) < 1:
+            msg.append({"role": "user", "content": "Output: "})
+        _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(chat_mdl.max_length * 0.97))
+        if len(msg) < 2:
+            msg.append({"role": "user", "content": "Output: "})
+        ans = chat_mdl.chat(msg[0]["content"], msg[1:], self._param.gen_conf())
+        ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
+
+        if self._param.cite and "content_ltks" in retrieval_res.columns and "vector" in retrieval_res.columns:
+            res = self.set_cite(retrieval_res, ans)
+            return pd.DataFrame([res])
+
+        return Generate.be_output(ans)
+
+    def stream_output(self, chat_mdl, prompt, retrieval_res):
+        res = None
+        if "empty_response" in retrieval_res.columns and not "".join(retrieval_res["content"]):
+            empty_res = "\n- ".join([str(t) for t in retrieval_res["empty_response"] if str(t)])
+            res = {"content": empty_res if empty_res else "Nothing found in knowledgebase!", "reference": []}
+            yield res
+            self.set_output(res)
+            return
+
+        msg = self._canvas.get_history(self._param.message_history_window_size)
+        if len(msg) < 1:
+            msg.append({"role": "user", "content": "Output: "})
+        _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(chat_mdl.max_length * 0.97))
+        if len(msg) < 2:
+            msg.append({"role": "user", "content": "Output: "})
+        answer = ""
+        for ans in chat_mdl.chat_streamly(msg[0]["content"], msg[1:], self._param.gen_conf()):
+            res = {"content": ans, "reference": []}
+            answer = ans
+            yield res
+
+        if self._param.cite and "content_ltks" in retrieval_res.columns and "vector" in retrieval_res.columns:
+            res = self.set_cite(retrieval_res, answer)
+            yield res
+
+        self.set_output(Generate.be_output(res))
+
+    def debug(self, **kwargs):
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        prompt = self._param.prompt
+
+        for para in self._param.debug_inputs:
+            kwargs[para["key"]] = para.get("value", "")
+
+        for n, v in kwargs.items():
+            prompt = re.sub(r"\{%s\}" % re.escape(n), str(v).replace("\\", " "), prompt)
+
+        u = kwargs.get("user")
+        ans = chat_mdl.chat(prompt, [{"role": "user", "content": u if u else "Output: "}], self._param.gen_conf())
+        return pd.DataFrame([ans])
--- a/agent/component/github.py
+++ b/agent/component/github.py
@ -0,0 +1,61 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import pandas as pd
+import requests
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class GitHubParam(ComponentParamBase):
+    """
+    Define the GitHub component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+
+
+class GitHub(ComponentBase, ABC):
+    component_name = "GitHub"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return GitHub.be_output("")
+
+        try:
+            url = 'https://api.github.com/search/repositories'
+            params = {"q": ans, "sort": "stars", "order": "desc", "per_page": self._param.top_n}
+            headers = {"Accept": "application/vnd.github+json", "X-GitHub-Api-Version": '2022-11-28'}
+            response = requests.get(url=url, params=params, headers=headers).json()
+
+            github_res = [{"content": '<a href="' + i["html_url"] + '">' + i["name"] + '</a>' + str(
+                i["description"]) + '\n stars:' + str(i['watchers'])} for i in response['items']]
+        except Exception as e:
+            return GitHub.be_output("**ERROR**: " + str(e))
+
+        if not github_res:
+            return GitHub.be_output("")
+
+        df = pd.DataFrame(github_res)
+        logging.debug(f"df: {df}")
+        return df
--- a/agent/component/google.py
+++ b/agent/component/google.py
@ -0,0 +1,96 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+from serpapi import GoogleSearch
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class GoogleParam(ComponentParamBase):
+    """
+    Define the Google component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+        self.api_key = "xxx"
+        self.country = "cn"
+        self.language = "en"
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_empty(self.api_key, "SerpApi API key")
+        self.check_valid_value(self.country, "Google Country",
+                               ['af', 'al', 'dz', 'as', 'ad', 'ao', 'ai', 'aq', 'ag', 'ar', 'am', 'aw', 'au', 'at',
+                                'az', 'bs', 'bh', 'bd', 'bb', 'by', 'be', 'bz', 'bj', 'bm', 'bt', 'bo', 'ba', 'bw',
+                                'bv', 'br', 'io', 'bn', 'bg', 'bf', 'bi', 'kh', 'cm', 'ca', 'cv', 'ky', 'cf', 'td',
+                                'cl', 'cn', 'cx', 'cc', 'co', 'km', 'cg', 'cd', 'ck', 'cr', 'ci', 'hr', 'cu', 'cy',
+                                'cz', 'dk', 'dj', 'dm', 'do', 'ec', 'eg', 'sv', 'gq', 'er', 'ee', 'et', 'fk', 'fo',
+                                'fj', 'fi', 'fr', 'gf', 'pf', 'tf', 'ga', 'gm', 'ge', 'de', 'gh', 'gi', 'gr', 'gl',
+                                'gd', 'gp', 'gu', 'gt', 'gn', 'gw', 'gy', 'ht', 'hm', 'va', 'hn', 'hk', 'hu', 'is',
+                                'in', 'id', 'ir', 'iq', 'ie', 'il', 'it', 'jm', 'jp', 'jo', 'kz', 'ke', 'ki', 'kp',
+                                'kr', 'kw', 'kg', 'la', 'lv', 'lb', 'ls', 'lr', 'ly', 'li', 'lt', 'lu', 'mo', 'mk',
+                                'mg', 'mw', 'my', 'mv', 'ml', 'mt', 'mh', 'mq', 'mr', 'mu', 'yt', 'mx', 'fm', 'md',
+                                'mc', 'mn', 'ms', 'ma', 'mz', 'mm', 'na', 'nr', 'np', 'nl', 'an', 'nc', 'nz', 'ni',
+                                'ne', 'ng', 'nu', 'nf', 'mp', 'no', 'om', 'pk', 'pw', 'ps', 'pa', 'pg', 'py', 'pe',
+                                'ph', 'pn', 'pl', 'pt', 'pr', 'qa', 're', 'ro', 'ru', 'rw', 'sh', 'kn', 'lc', 'pm',
+                                'vc', 'ws', 'sm', 'st', 'sa', 'sn', 'rs', 'sc', 'sl', 'sg', 'sk', 'si', 'sb', 'so',
+                                'za', 'gs', 'es', 'lk', 'sd', 'sr', 'sj', 'sz', 'se', 'ch', 'sy', 'tw', 'tj', 'tz',
+                                'th', 'tl', 'tg', 'tk', 'to', 'tt', 'tn', 'tr', 'tm', 'tc', 'tv', 'ug', 'ua', 'ae',
+                                'uk', 'gb', 'us', 'um', 'uy', 'uz', 'vu', 've', 'vn', 'vg', 'vi', 'wf', 'eh', 'ye',
+                                'zm', 'zw'])
+        self.check_valid_value(self.language, "Google languages",
+                               ['af', 'ak', 'sq', 'ws', 'am', 'ar', 'hy', 'az', 'eu', 'be', 'bem', 'bn', 'bh',
+                                'xx-bork', 'bs', 'br', 'bg', 'bt', 'km', 'ca', 'chr', 'ny', 'zh-cn', 'zh-tw', 'co',
+                                'hr', 'cs', 'da', 'nl', 'xx-elmer', 'en', 'eo', 'et', 'ee', 'fo', 'tl', 'fi', 'fr',
+                                'fy', 'gaa', 'gl', 'ka', 'de', 'el', 'kl', 'gn', 'gu', 'xx-hacker', 'ht', 'ha', 'haw',
+                                'iw', 'hi', 'hu', 'is', 'ig', 'id', 'ia', 'ga', 'it', 'ja', 'jw', 'kn', 'kk', 'rw',
+                                'rn', 'xx-klingon', 'kg', 'ko', 'kri', 'ku', 'ckb', 'ky', 'lo', 'la', 'lv', 'ln', 'lt',
+                                'loz', 'lg', 'ach', 'mk', 'mg', 'ms', 'ml', 'mt', 'mv', 'mi', 'mr', 'mfe', 'mo', 'mn',
+                                'sr-me', 'my', 'ne', 'pcm', 'nso', 'no', 'nn', 'oc', 'or', 'om', 'ps', 'fa',
+                                'xx-pirate', 'pl', 'pt', 'pt-br', 'pt-pt', 'pa', 'qu', 'ro', 'rm', 'nyn', 'ru', 'gd',
+                                'sr', 'sh', 'st', 'tn', 'crs', 'sn', 'sd', 'si', 'sk', 'sl', 'so', 'es', 'es-419', 'su',
+                                'sw', 'sv', 'tg', 'ta', 'tt', 'te', 'th', 'ti', 'to', 'lua', 'tum', 'tr', 'tk', 'tw',
+                                'ug', 'uk', 'ur', 'uz', 'vu', 'vi', 'cy', 'wo', 'xh', 'yi', 'yo', 'zu']
+                               )
+
+
+class Google(ComponentBase, ABC):
+    component_name = "Google"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return Google.be_output("")
+
+        try:
+            client = GoogleSearch(
+                {"engine": "google", "q": ans, "api_key": self._param.api_key, "gl": self._param.country,
+                 "hl": self._param.language, "num": self._param.top_n})
+            google_res = [{"content": '<a href="' + i["link"] + '">' + i["title"] + '</a>    ' + i["snippet"]} for i in
+                          client.get_dict()["organic_results"]]
+        except Exception:
+            return Google.be_output("**ERROR**: Invalid or unavailable parameters!")
+
+        if not google_res:
+            return Google.be_output("")
+
+        df = pd.DataFrame(google_res)
+        logging.debug(f"df: {df}")
+        return df
--- a/agent/component/googlescholar.py
+++ b/agent/component/googlescholar.py
@ -0,0 +1,70 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+from scholarly import scholarly
+
+
+class GoogleScholarParam(ComponentParamBase):
+    """
+    Define the GoogleScholar component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 6
+        self.sort_by = 'relevance'
+        self.year_low = None
+        self.year_high = None
+        self.patents = True
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_valid_value(self.sort_by, "GoogleScholar Sort_by", ['date', 'relevance'])
+        self.check_boolean(self.patents, "Whether or not to include patents, defaults to True")
+
+
+class GoogleScholar(ComponentBase, ABC):
+    component_name = "GoogleScholar"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return GoogleScholar.be_output("")
+
+        scholar_client = scholarly.search_pubs(ans, patents=self._param.patents, year_low=self._param.year_low,
+                                               year_high=self._param.year_high, sort_by=self._param.sort_by)
+        scholar_res = []
+        for i in range(self._param.top_n):
+            try:
+                pub = next(scholar_client)
+                scholar_res.append({"content": 'Title: ' + pub['bib']['title'] + '\n_Url: <a href="' + pub[
+                    'pub_url'] + '"></a> ' + "\n author: " + ",".join(pub['bib']['author']) + '\n Abstract: ' + pub[
+                                                   'bib'].get('abstract', 'no abstract')})
+
+            except (StopIteration, Exception):
+                logging.exception("GoogleScholar")
+                break
+
+        if not scholar_res:
+            return GoogleScholar.be_output("")
+
+        df = pd.DataFrame(scholar_res)
+        logging.debug(f"df: {df}")
+        return df
--- a/agent/component/invoke.py
+++ b/agent/component/invoke.py
@ -0,0 +1,132 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import re
+from abc import ABC
+import requests
+from deepdoc.parser import HtmlParser
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class InvokeParam(ComponentParamBase):
+    """
+    Define the Invoke component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.proxy = None
+        self.headers = ""
+        self.method = "get"
+        self.variables = []
+        self.url = ""
+        self.timeout = 60
+        self.clean_html = False
+        self.datatype = "json"  # New parameter to determine data posting type
+
+    def check(self):
+        self.check_valid_value(self.method.lower(), "HTTP request method", ['get', 'post', 'put'])
+        self.check_empty(self.url, "End point URL")
+        self.check_positive_integer(self.timeout, "Timeout in seconds")
+        self.check_boolean(self.clean_html, "Clean HTML")
+        self.check_valid_value(self.datatype.lower(), "Data post type", ['json', 'formdata'])  # Check for valid datapost value
+
+
+class Invoke(ComponentBase, ABC):
+    component_name = "Invoke"
+
+    def _run(self, history, **kwargs):
+        args = {}
+        for para in self._param.variables:
+            if para.get("component_id"):
+                if '@' in para["component_id"]:
+                    component = para["component_id"].split('@')[0]
+                    field = para["component_id"].split('@')[1]
+                    cpn = self._canvas.get_component(component)["obj"]
+                    for param in cpn._param.query:
+                        if param["key"] == field:
+                            if "value" in param:
+                                args[para["key"]] = param["value"]
+                else:
+                    cpn = self._canvas.get_component(para["component_id"])["obj"]
+                    if cpn.component_name.lower() == "answer":
+                        args[para["key"]] = self._canvas.get_history(1)[0]["content"]
+                        continue
+                    _, out = cpn.output(allow_partial=False)
+                    if not out.empty:
+                        args[para["key"]] = "\n".join(out["content"])
+            else:
+                args[para["key"]] = para["value"]
+
+        url = self._param.url.strip()
+        if url.find("http") != 0:
+            url = "http://" + url
+
+        method = self._param.method.lower()
+        headers = {}
+        if self._param.headers:
+            headers = json.loads(self._param.headers)
+        proxies = None
+        if self._param.proxy and re.sub(r"https?:?/?/?", "", self._param.proxy):
+            proxies = {"http": self._param.proxy, "https": self._param.proxy}
+
+        if method == 'get':
+            response = requests.get(url=url,
+                                    params=args,
+                                    headers=headers,
+                                    proxies=proxies,
+                                    timeout=self._param.timeout)
+            if self._param.clean_html:
+                sections = HtmlParser()(None, response.content)
+                return Invoke.be_output("\n".join(sections))
+
+            return Invoke.be_output(response.text)
+
+        if method == 'put':
+            if self._param.datatype.lower() == 'json':
+                response = requests.put(url=url,
+                                        json=args,
+                                        headers=headers,
+                                        proxies=proxies,
+                                        timeout=self._param.timeout)
+            else:
+                response = requests.put(url=url,
+                                        data=args,
+                                        headers=headers,
+                                        proxies=proxies,
+                                        timeout=self._param.timeout)
+            if self._param.clean_html:
+                sections = HtmlParser()(None, response.content)
+                return Invoke.be_output("\n".join(sections))
+            return Invoke.be_output(response.text)
+
+        if method == 'post':
+            if self._param.datatype.lower() == 'json':
+                response = requests.post(url=url,
+                                         json=args,
+                                         headers=headers,
+                                         proxies=proxies,
+                                         timeout=self._param.timeout)
+            else:
+                response = requests.post(url=url,
+                                         data=args,
+                                         headers=headers,
+                                         proxies=proxies,
+                                         timeout=self._param.timeout)
+            if self._param.clean_html:
+                sections = HtmlParser()(None, response.content)
+                return Invoke.be_output("\n".join(sections))
+            return Invoke.be_output(response.text)
--- a/agent/component/iteration.py
+++ b/agent/component/iteration.py
@ -0,0 +1,45 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class IterationParam(ComponentParamBase):
+    """
+    Define the Iteration component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.delimiter = ","
+
+    def check(self):
+        self.check_empty(self.delimiter, "Delimiter")
+
+
+class Iteration(ComponentBase, ABC):
+    component_name = "Iteration"
+
+    def get_start(self):
+        for cid in self._canvas.components.keys():
+            if self._canvas.get_component(cid)["obj"].component_name.lower() != "iterationitem":
+                continue
+            if self._canvas.get_component(cid)["parent_id"] == self._id:
+                return self._canvas.get_component(cid)
+
+    def _run(self, history, **kwargs):
+        return self.output(allow_partial=False)[1]
+
--- a/agent/component/iterationitem.py
+++ b/agent/component/iterationitem.py
@ -0,0 +1,49 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class IterationItemParam(ComponentParamBase):
+    """
+    Define the IterationItem component parameters.
+    """
+    def check(self):
+        return True
+
+
+class IterationItem(ComponentBase, ABC):
+    component_name = "IterationItem"
+
+    def __init__(self, canvas, id, param: ComponentParamBase):
+        super().__init__(canvas, id, param)
+        self._idx = 0
+
+    def _run(self, history, **kwargs):
+        parent = self.get_parent()
+        ans = parent.get_input()
+        ans = parent._param.delimiter.join(ans["content"]) if "content" in ans else ""
+        ans = [a.strip() for a in ans.split(parent._param.delimiter)]
+        df = pd.DataFrame([{"content": ans[self._idx]}])
+        self._idx += 1
+        if self._idx >= len(ans):
+            self._idx = -1
+        return df
+
+    def end(self):
+        return self._idx == -1
+
--- a/agent/component/jin10.py
+++ b/agent/component/jin10.py
@ -0,0 +1,130 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+from abc import ABC
+import pandas as pd
+import requests
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class Jin10Param(ComponentParamBase):
+    """
+    Define the Jin10 component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.type = "flash"
+        self.secret_key = "xxx"
+        self.flash_type = '1'
+        self.calendar_type = 'cj'
+        self.calendar_datatype = 'data'
+        self.symbols_type = 'GOODS'
+        self.symbols_datatype = 'symbols'
+        self.contain = ""
+        self.filter = ""
+
+    def check(self):
+        self.check_valid_value(self.type, "Type", ['flash', 'calendar', 'symbols', 'news'])
+        self.check_valid_value(self.flash_type, "Flash Type", ['1', '2', '3', '4', '5'])
+        self.check_valid_value(self.calendar_type, "Calendar Type", ['cj', 'qh', 'hk', 'us'])
+        self.check_valid_value(self.calendar_datatype, "Calendar DataType", ['data', 'event', 'holiday'])
+        self.check_valid_value(self.symbols_type, "Symbols Type", ['GOODS', 'FOREX', 'FUTURE', 'CRYPTO'])
+        self.check_valid_value(self.symbols_datatype, 'Symbols DataType', ['symbols', 'quotes'])
+
+
+class Jin10(ComponentBase, ABC):
+    component_name = "Jin10"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return Jin10.be_output("")
+
+        jin10_res = []
+        headers = {'secret-key': self._param.secret_key}
+        try:
+            if self._param.type == "flash":
+                params = {
+                    'category': self._param.flash_type,
+                    'contain': self._param.contain,
+                    'filter': self._param.filter
+                }
+                response = requests.get(
+                    url='https://open-data-api.jin10.com/data-api/flash?category=' + self._param.flash_type,
+                    headers=headers, data=json.dumps(params))
+                response = response.json()
+                for i in response['data']:
+                    jin10_res.append({"content": i['data']['content']})
+            if self._param.type == "calendar":
+                params = {
+                    'category': self._param.calendar_type
+                }
+                response = requests.get(
+                    url='https://open-data-api.jin10.com/data-api/calendar/' + self._param.calendar_datatype + '?category=' + self._param.calendar_type,
+                    headers=headers, data=json.dumps(params))
+
+                response = response.json()
+                jin10_res.append({"content": pd.DataFrame(response['data']).to_markdown()})
+            if self._param.type == "symbols":
+                params = {
+                    'type': self._param.symbols_type
+                }
+                if self._param.symbols_datatype == "quotes":
+                    params['codes'] = 'BTCUSD'
+                response = requests.get(
+                    url='https://open-data-api.jin10.com/data-api/' + self._param.symbols_datatype + '?type=' + self._param.symbols_type,
+                    headers=headers, data=json.dumps(params))
+                response = response.json()
+                if self._param.symbols_datatype == "symbols":
+                    for i in response['data']:
+                        i['Commodity Code'] = i['c']
+                        i['Stock Exchange'] = i['e']
+                        i['Commodity Name'] = i['n']
+                        i['Commodity Type'] = i['t']
+                        del i['c'], i['e'], i['n'], i['t']
+                if self._param.symbols_datatype == "quotes":
+                    for i in response['data']:
+                        i['Selling Price'] = i['a']
+                        i['Buying Price'] = i['b']
+                        i['Commodity Code'] = i['c']
+                        i['Stock Exchange'] = i['e']
+                        i['Highest Price'] = i['h']
+                        i['Yesterday’s Closing Price'] = i['hc']
+                        i['Lowest Price'] = i['l']
+                        i['Opening Price'] = i['o']
+                        i['Latest Price'] = i['p']
+                        i['Market Quote Time'] = i['t']
+                        del i['a'], i['b'], i['c'], i['e'], i['h'], i['hc'], i['l'], i['o'], i['p'], i['t']
+                jin10_res.append({"content": pd.DataFrame(response['data']).to_markdown()})
+            if self._param.type == "news":
+                params = {
+                    'contain': self._param.contain,
+                    'filter': self._param.filter
+                }
+                response = requests.get(
+                    url='https://open-data-api.jin10.com/data-api/news',
+                    headers=headers, data=json.dumps(params))
+                response = response.json()
+                jin10_res.append({"content": pd.DataFrame(response['data']).to_markdown()})
+        except Exception as e:
+            return Jin10.be_output("**ERROR**: " + str(e))
+
+        if not jin10_res:
+            return Jin10.be_output("")
+
+        return pd.DataFrame(jin10_res)
--- a/agent/component/keyword.py
+++ b/agent/component/keyword.py
@ -0,0 +1,65 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+import re
+from abc import ABC
+from api.db import LLMType
+from api.db.services.llm_service import LLMBundle
+from agent.component import GenerateParam, Generate
+
+
+class KeywordExtractParam(GenerateParam):
+    """
+    Define the KeywordExtract component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 1
+
+    def check(self):
+        super().check()
+        self.check_positive_integer(self.top_n, "Top N")
+
+    def get_prompt(self):
+        self.prompt = """
+- Role: You're a question analyzer. 
+- Requirements: 
+  - Summarize user's question, and give top %s important keyword/phrase.
+  - Use comma as a delimiter to separate keywords/phrases.
+- Answer format: (in language of user's question)
+  - keyword: 
+""" % self.top_n
+        return self.prompt
+
+
+class KeywordExtract(Generate, ABC):
+    component_name = "KeywordExtract"
+
+    def _run(self, history, **kwargs):
+        query = self.get_input()
+        query = str(query["content"][0]) if "content" in query else ""
+
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        ans = chat_mdl.chat(self._param.get_prompt(), [{"role": "user", "content": query}],
+                            self._param.gen_conf())
+
+        ans = re.sub(r".*keyword:", "", ans).strip()
+        logging.debug(f"ans: {ans}")
+        return KeywordExtract.be_output(ans)
+
+    def debug(self, **kwargs):
+        return self._run([], **kwargs)
--- a/agent/component/message.py
+++ b/agent/component/message.py
@ -0,0 +1,53 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import random
+from abc import ABC
+from functools import partial
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class MessageParam(ComponentParamBase):
+
+    """
+    Define the Message component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.messages = []
+
+    def check(self):
+        self.check_empty(self.messages, "[Message]")
+        return True
+
+
+class Message(ComponentBase, ABC):
+    component_name = "Message"
+
+    def _run(self, history, **kwargs):
+        if kwargs.get("stream"):
+            return partial(self.stream_output)
+
+        return Message.be_output(random.choice(self._param.messages))
+
+    def stream_output(self):
+        res = None
+        if self._param.messages:
+            res = {"content": random.choice(self._param.messages)}
+            yield res
+
+        self.set_output(res)
+
+
--- a/agent/component/pubmed.py
+++ b/agent/component/pubmed.py
@ -0,0 +1,69 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+from Bio import Entrez
+import re
+import pandas as pd
+import xml.etree.ElementTree as ET
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class PubMedParam(ComponentParamBase):
+    """
+    Define the PubMed component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 5
+        self.email = "A.N.Other@example.com"
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+
+
+class PubMed(ComponentBase, ABC):
+    component_name = "PubMed"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return PubMed.be_output("")
+
+        try:
+            Entrez.email = self._param.email
+            pubmedids = Entrez.read(Entrez.esearch(db='pubmed', retmax=self._param.top_n, term=ans))['IdList']
+            pubmedcnt = ET.fromstring(re.sub(r'<(/?)b>|<(/?)i>', '', Entrez.efetch(db='pubmed', id=",".join(pubmedids),
+                                                                                   retmode="xml").read().decode(
+                "utf-8")))
+            # findtext() returns None for a missing path, so every field gets an explicit fallback.
+            pubmed_res = [{"content": 'Title:' + (child.findtext("MedlineCitation/Article/ArticleTitle") or "")
+                                      + '\nUrl:<a href="https://pubmed.ncbi.nlm.nih.gov/'
+                                      + (child.findtext("MedlineCitation/PMID") or "") + '">' + '</a>\n'
+                                      + 'Abstract:' + (child.findtext("MedlineCitation/Article/Abstract/AbstractText")
+                                                       or "No abstract available")}
+                          for child in pubmedcnt.findall("PubmedArticle")]
+        except Exception as e:
+            return PubMed.be_output("**ERROR**: " + str(e))
+
+        if not pubmed_res:
+            return PubMed.be_output("")
+
+        df = pd.DataFrame(pubmed_res)
+        logging.debug(f"df: {df}")
+        return df
--- a/agent/component/qweather.py
+++ b/agent/component/qweather.py
@ -0,0 +1,111 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import pandas as pd
+import requests
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class QWeatherParam(ComponentParamBase):
+    """
+    Define the QWeather component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.web_apikey = "xxx"
+        self.lang = "zh"
+        self.type = "weather"
+        self.user_type = 'free'
+        self.error_code = {
+            "204": "The request succeeded, but the queried region has no data of the requested kind at this time.",
+            "400": "Bad request: a request parameter may be incorrect, or a mandatory parameter may be missing.",
+            "401": "Authentication failed: possibly a wrong KEY, a wrong digital signature, or the wrong type of KEY (e.g. using the SDK's KEY to access the Web API).",
+            "402": "The number of accesses is exhausted or the balance is insufficient to keep using the service; recharge, upgrade the access quota, or wait for it to be reset.",
+            "403": "Access denied: the bound PackageName, BundleID, or domain/IP address may be inconsistent, or the data may require additional payment.",
+            "404": "The queried data or region does not exist.",
+            "429": "Exceeded the QPM limit (number of accesses per minute); please refer to the QPM description.",
+            "500": "No response or timeout; the interface service is abnormal, please contact the service provider."
+            }
+        # Weather
+        self.time_period = 'now'
+
+    def check(self):
+        self.check_empty(self.web_apikey, "QWeather API key")
+        self.check_valid_value(self.type, "Type", ["weather", "indices", "airquality"])
+        self.check_valid_value(self.user_type, "Free subscription or paid subscription", ["free", "paid"])
+        self.check_valid_value(self.lang, "Use language",
+                               ['zh', 'zh-hant', 'en', 'de', 'es', 'fr', 'it', 'ja', 'ko', 'ru', 'hi', 'th', 'ar', 'pt',
+                                'bn', 'ms', 'nl', 'el', 'la', 'sv', 'id', 'pl', 'tr', 'cs', 'et', 'vi', 'fil', 'fi',
+                                'he', 'is', 'nb'])
+        self.check_valid_value(self.time_period, "Time period", ['now', '3d', '7d', '10d', '15d', '30d'])
+
+
+class QWeather(ComponentBase, ABC):
+    component_name = "QWeather"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = "".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return QWeather.be_output("")
+
+        try:
+            response = requests.get(url="https://geoapi.qweather.com/v2/city/lookup",
+                                    params={"location": ans, "key": self._param.web_apikey}).json()
+            if response["code"] == "200":
+                location_id = response["location"][0]["id"]
+            else:
+                return QWeather.be_output("**Error**: " + self._param.error_code[response["code"]])
+
+            base_url = "https://api.qweather.com/v7/" if self._param.user_type == 'paid' else "https://devapi.qweather.com/v7/"
+
+            if self._param.type == "weather":
+                url = base_url + "weather/" + self._param.time_period + "?location=" + location_id + "&key=" + self._param.web_apikey + "&lang=" + self._param.lang
+                response = requests.get(url=url).json()
+                if response["code"] == "200":
+                    if self._param.time_period == "now":
+                        return QWeather.be_output(str(response["now"]))
+                    else:
+                        qweather_res = [{"content": str(i) + "\n"} for i in response["daily"]]
+                        if not qweather_res:
+                            return QWeather.be_output("")
+
+                        df = pd.DataFrame(qweather_res)
+                        return df
+                else:
+                    return QWeather.be_output("**Error**: " + self._param.error_code[response["code"]])
+
+            elif self._param.type == "indices":
+                url = base_url + "indices/1d?type=0&location=" + location_id + "&key=" + self._param.web_apikey + "&lang=" + self._param.lang
+                response = requests.get(url=url).json()
+                if response["code"] == "200":
+                    indices_res = response["daily"][0]["date"] + "\n" + "\n".join(
+                        [i["name"] + ": " + i["category"] + ", " + i["text"] for i in response["daily"]])
+                    return QWeather.be_output(indices_res)
+
+                else:
+                    return QWeather.be_output("**Error**: " + self._param.error_code[response["code"]])
+
+            elif self._param.type == "airquality":
+                url = base_url + "air/now?location=" + location_id + "&key=" + self._param.web_apikey + "&lang=" + self._param.lang
+                response = requests.get(url=url).json()
+                if response["code"] == "200":
+                    return QWeather.be_output(str(response["now"]))
+                else:
+                    return QWeather.be_output("**Error**: " + self._param.error_code[response["code"]])
+        except Exception as e:
+            return QWeather.be_output("**Error**: " + str(e))
--- a/agent/component/relevant.py
+++ b/agent/component/relevant.py
@ -0,0 +1,83 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+from api.db import LLMType
+from api.db.services.llm_service import LLMBundle
+from agent.component import GenerateParam, Generate
+from rag.utils import num_tokens_from_string, encoder
+
+
+class RelevantParam(GenerateParam):
+
+    """
+    Define the Relevant component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.prompt = ""
+        self.yes = ""
+        self.no = ""
+
+    def check(self):
+        super().check()
+        self.check_empty(self.yes, "[Relevant] 'Yes'")
+        self.check_empty(self.no, "[Relevant] 'No'")
+
+    def get_prompt(self):
+        self.prompt = """
+        You are a grader assessing relevance of a retrieved document to a user question. 
+        It does not need to be a stringent test. The goal is to filter out erroneous retrievals.
+        If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant. 
+        Give a binary 'yes' or 'no' score to indicate whether the document is relevant to the question.
+        No other words needed except 'yes' or 'no'.
+        """
+        return self.prompt
+
+
+class Relevant(Generate, ABC):
+    component_name = "Relevant"
+
+    def _run(self, history, **kwargs):
+        q = ""
+        for r, c in self._canvas.history[::-1]:
+            if r == "user":
+                q = c
+                break
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return Relevant.be_output(self._param.no)
+        ans = "Documents: \n" + ans
+        ans = f"Question: {q}\n" + ans
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+
+        if num_tokens_from_string(ans) >= chat_mdl.max_length - 4:
+            ans = encoder.decode(encoder.encode(ans)[:chat_mdl.max_length - 4])
+
+        ans = chat_mdl.chat(self._param.get_prompt(), [{"role": "user", "content": ans}],
+                            self._param.gen_conf())
+
+        logging.debug(ans)
+        if ans.lower().find("yes") >= 0:
+            return Relevant.be_output(self._param.yes)
+        if ans.lower().find("no") >= 0:
+            return Relevant.be_output(self._param.no)
+        raise ValueError(f"Relevant component got: {ans}")  # an assert would be stripped under python -O
+
+    def debug(self, **kwargs):
+        return self._run([], **kwargs)
+
--- a/agent/component/retrieval.py
+++ b/agent/component/retrieval.py
@ -0,0 +1,89 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+
+import pandas as pd
+
+from api.db import LLMType
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import LLMBundle
+from api import settings
+from agent.component.base import ComponentBase, ComponentParamBase
+from rag.app.tag import label_question
+
+
+class RetrievalParam(ComponentParamBase):
+
+    """
+    Define the Retrieval component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.similarity_threshold = 0.2
+        self.keywords_similarity_weight = 0.5
+        self.top_n = 8
+        self.top_k = 1024
+        self.kb_ids = []
+        self.rerank_id = ""
+        self.empty_response = ""
+
+    def check(self):
+        self.check_decimal_float(self.similarity_threshold, "[Retrieval] Similarity threshold")
+        self.check_decimal_float(self.keywords_similarity_weight, "[Retrieval] Keyword similarity weight")
+        self.check_positive_number(self.top_n, "[Retrieval] Top N")
+
+
+class Retrieval(ComponentBase, ABC):
+    component_name = "Retrieval"
+
+    def _run(self, history, **kwargs):
+        query = self.get_input()
+        query = str(query["content"][0]) if "content" in query else ""
+
+        kbs = KnowledgebaseService.get_by_ids(self._param.kb_ids)
+        if not kbs:
+            return Retrieval.be_output("")
+
+        embd_nms = list(set([kb.embd_id for kb in kbs]))
+        assert len(embd_nms) == 1, "Knowledge bases use different embedding models."
+
+        embd_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.EMBEDDING, embd_nms[0])
+        self._canvas.set_embedding_model(embd_nms[0])
+
+        rerank_mdl = None
+        if self._param.rerank_id:
+            rerank_mdl = LLMBundle(kbs[0].tenant_id, LLMType.RERANK, self._param.rerank_id)
+
+        kbinfos = settings.retrievaler.retrieval(query, embd_mdl, kbs[0].tenant_id, self._param.kb_ids,
+                                        1, self._param.top_n,
+                                        self._param.similarity_threshold, 1 - self._param.keywords_similarity_weight,
+                                        aggs=False, rerank_mdl=rerank_mdl,
+                                        rank_feature=label_question(query, kbs))
+
+        if not kbinfos["chunks"]:
+            df = Retrieval.be_output("")
+            if self._param.empty_response and self._param.empty_response.strip():
+                df["empty_response"] = self._param.empty_response
+            return df
+
+        df = pd.DataFrame(kbinfos["chunks"])
+        df["content"] = df["content_with_weight"]
+        del df["content_with_weight"]
+        logging.debug("{} {}".format(query, df))
+        return df
+
+
--- a/agent/component/rewrite.py
+++ b/agent/component/rewrite.py
@ -0,0 +1,142 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from api.db import LLMType
+from api.db.services.llm_service import LLMBundle
+from agent.component import GenerateParam, Generate
+
+
+class RewriteQuestionParam(GenerateParam):
+    """
+    Define the RewriteQuestion component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.temperature = 0.9
+        self.prompt = ""
+        self.language = ""
+
+    def check(self):
+        super().check()
+
+    def get_prompt(self, conv, language, query):
+        prompt = """
+Role: A helpful assistant
+Task: Generate a full user question that would follow the conversation.
+Requirements & Restrictions:
+  - Text generated MUST be in the same language of the original user's question.
+  - If the user's latest question is already complete, don't do anything, just return the original question.
+  - DON'T generate anything except a refined question."""
+
+        if language:
+            prompt += f"""
+  - Text generated MUST be in {language}"""
+
+        prompt += f"""
+######################
+-Examples-
+######################
+# Example 1
+## Conversation
+USER: What is the name of Donald Trump's father?
+ASSISTANT:  Fred Trump.
+USER: And his mother?
+###############
+Output: What's the name of Donald Trump's mother?
+------------
+# Example 2
+## Conversation
+USER: What is the name of Donald Trump's father?
+ASSISTANT:  Fred Trump.
+USER: And his mother?
+ASSISTANT:  Mary Trump.
+USER: What's her full name?
+###############
+Output: What's the full name of Donald Trump's mother Mary Trump?
+######################
+# Real Data
+## Conversation
+{conv}
+###############
+"""
+        return prompt
+
+
+class RewriteQuestion(Generate, ABC):
+    component_name = "RewriteQuestion"
+
+    def _run(self, history, **kwargs):
+        hist = self._canvas.get_history(self._param.message_history_window_size)
+        query = self.get_input()
+        query = str(query["content"][0]) if "content" in query else ""
+        conv = []
+        for m in hist:
+            if m["role"] not in ["user", "assistant"]:
+                continue
+            conv.append("{}: {}".format(m["role"].upper(), m["content"]))
+        conv = "\n".join(conv)
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        ans = chat_mdl.chat(self._param.get_prompt(conv, self.gen_lang(self._param.language), query),
+                            [{"role": "user", "content": "Output: "}], self._param.gen_conf())
+        self._canvas.history.pop()
+        self._canvas.history.append(("user", ans))
+        return RewriteQuestion.be_output(ans)
+
+    @staticmethod
+    def gen_lang(language):
+        # convert code lang to language word for the prompt
+        language_dict = {'af': 'Afrikaans', 'ak': 'Akan', 'sq': 'Albanian', 'ws': 'Samoan', 'am': 'Amharic',
+                         'ar': 'Arabic', 'hy': 'Armenian', 'az': 'Azerbaijani', 'eu': 'Basque', 'be': 'Belarusian',
+                         'bem': 'Bemba', 'bn': 'Bengali', 'bh': 'Bihari',
+                         'xx-bork': 'Bork', 'bs': 'Bosnian', 'br': 'Breton', 'bg': 'Bulgarian', 'bt': 'Bhutani',
+                         'km': 'Cambodian', 'ca': 'Catalan', 'chr': 'Cherokee', 'ny': 'Chichewa', 'zh-cn': 'Chinese',
+                         'zh-tw': 'Chinese', 'co': 'Corsican',
+                         'hr': 'Croatian', 'cs': 'Czech', 'da': 'Danish', 'nl': 'Dutch', 'xx-elmer': 'Elmer',
+                         'en': 'English', 'eo': 'Esperanto', 'et': 'Estonian', 'ee': 'Ewe', 'fo': 'Faroese',
+                         'tl': 'Filipino', 'fi': 'Finnish', 'fr': 'French',
+                         'fy': 'Frisian', 'gaa': 'Ga', 'gl': 'Galician', 'ka': 'Georgian', 'de': 'German',
+                         'el': 'Greek', 'kl': 'Greenlandic', 'gn': 'Guarani', 'gu': 'Gujarati', 'xx-hacker': 'Hacker',
+                         'ht': 'Haitian Creole', 'ha': 'Hausa', 'haw': 'Hawaiian',
+                         'iw': 'Hebrew', 'hi': 'Hindi', 'hu': 'Hungarian', 'is': 'Icelandic', 'ig': 'Igbo',
+                         'id': 'Indonesian', 'ia': 'Interlingua', 'ga': 'Irish', 'it': 'Italian', 'ja': 'Japanese',
+                         'jw': 'Javanese', 'kn': 'Kannada', 'kk': 'Kazakh', 'rw': 'Kinyarwanda',
+                         'rn': 'Kirundi', 'xx-klingon': 'Klingon', 'kg': 'Kongo', 'ko': 'Korean', 'kri': 'Krio',
+                         'ku': 'Kurdish', 'ckb': 'Kurdish (Sorani)', 'ky': 'Kyrgyz', 'lo': 'Laothian', 'la': 'Latin',
+                         'lv': 'Latvian', 'ln': 'Lingala', 'lt': 'Lithuanian',
+                         'loz': 'Lozi', 'lg': 'Luganda', 'ach': 'Luo', 'mk': 'Macedonian', 'mg': 'Malagasy',
+                         'ms': 'Malay', 'ml': 'Malayalam', 'mt': 'Maltese', 'mv': 'Maldivian', 'mi': 'Maori',
+                         'mr': 'Marathi', 'mfe': 'Mauritian Creole', 'mo': 'Moldavian', 'mn': 'Mongolian',
+                         'sr-me': 'Montenegrin', 'my': 'Burmese', 'ne': 'Nepali', 'pcm': 'Nigerian Pidgin',
+                         'nso': 'Northern Sotho', 'no': 'Norwegian', 'nn': 'Norwegian Nynorsk', 'oc': 'Occitan',
+                         'or': 'Oriya', 'om': 'Oromo', 'ps': 'Pashto', 'fa': 'Persian',
+                         'xx-pirate': 'Pirate', 'pl': 'Polish', 'pt': 'Portuguese', 'pt-br': 'Portuguese (Brazilian)',
+                         'pt-pt': 'Portuguese (Portugal)', 'pa': 'Punjabi', 'qu': 'Quechua', 'ro': 'Romanian',
+                         'rm': 'Romansh', 'nyn': 'Runyankole', 'ru': 'Russian', 'gd': 'Scots Gaelic',
+                         'sr': 'Serbian', 'sh': 'Serbo-Croatian', 'st': 'Sesotho', 'tn': 'Setswana',
+                         'crs': 'Seychellois Creole', 'sn': 'Shona', 'sd': 'Sindhi', 'si': 'Sinhalese', 'sk': 'Slovak',
+                         'sl': 'Slovenian', 'so': 'Somali', 'es': 'Spanish', 'es-419': 'Spanish (Latin America)',
+                         'su': 'Sundanese',
+                         'sw': 'Swahili', 'sv': 'Swedish', 'tg': 'Tajik', 'ta': 'Tamil', 'tt': 'Tatar', 'te': 'Telugu',
+                         'th': 'Thai', 'ti': 'Tigrinya', 'to': 'Tongan', 'lua': 'Tshiluba', 'tum': 'Tumbuka',
+                         'tr': 'Turkish', 'tk': 'Turkmen', 'tw': 'Twi',
+                         'ug': 'Uyghur', 'uk': 'Ukrainian', 'ur': 'Urdu', 'uz': 'Uzbek', 'vu': 'Vanuatu',
+                         'vi': 'Vietnamese', 'cy': 'Welsh', 'wo': 'Wolof', 'xh': 'Xhosa', 'yi': 'Yiddish',
+                         'yo': 'Yoruba', 'zu': 'Zulu'}
+        if language in language_dict:
+            return language_dict[language]
+        else:
+            return ""
--- a/agent/component/switch.py
+++ b/agent/component/switch.py
@ -0,0 +1,131 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class SwitchParam(ComponentParamBase):
+    """
+    Define the Switch component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        """
+        {
+            "logical_operator" : "and | or"
+            "items" : [
+                            {"cpn_id": "categorize:0", "operator": "contains", "value": ""},
+                            {"cpn_id": "categorize:0", "operator": "contains", "value": ""},...],
+            "to": ""
+        }
+        """
+        self.conditions = []
+        self.end_cpn_id = "answer:0"
+        self.operators = ['contains', 'not contains', 'start with', 'end with', 'empty', 'not empty', '=', '≠', '>',
+                          '<', '≥', '≤']
+
+    def check(self):
+        self.check_empty(self.conditions, "[Switch] conditions")
+        for cond in self.conditions:
+            if not cond["to"]:
+                raise ValueError("[Switch] 'To' can not be empty!")
+
+
+class Switch(ComponentBase, ABC):
+    component_name = "Switch"
+
+    def get_dependent_components(self):
+        res = []
+        for cond in self._param.conditions:
+            for item in cond["items"]:
+                if not item["cpn_id"]:
+                    continue
+                if item["cpn_id"].find("begin") >= 0:
+                    continue
+                cid = item["cpn_id"].split("@")[0]
+                res.append(cid)
+
+        return list(set(res))
+
+    def _run(self, history, **kwargs):
+        for cond in self._param.conditions:
+            res = []
+            for item in cond["items"]:
+                if not item["cpn_id"]:
+                    continue
+                cid = item["cpn_id"].split("@")[0]
+                if item["cpn_id"].find("@") > 0:
+                    cpn_id, key = item["cpn_id"].split("@")
+                    for p in self._canvas.get_component(cid)["obj"]._param.query:
+                        if p["key"] == key:
+                            res.append(self.process_operator(p.get("value",""), item["operator"], item.get("value", "")))
+                            break
+                else:
+                    out = self._canvas.get_component(cid)["obj"].output()[1]
+                    cpn_input = "" if "content" not in out.columns else " ".join([str(s) for s in out["content"]])
+                    res.append(self.process_operator(cpn_input, item["operator"], item.get("value", "")))
+
+                if cond["logical_operator"] != "and" and any(res):
+                    return Switch.be_output(cond["to"])
+
+            if all(res):
+                return Switch.be_output(cond["to"])
+
+        return Switch.be_output(self._param.end_cpn_id)
+
+    def process_operator(self, input: str, operator: str, value: str) -> bool:
+        if not isinstance(input, str) or not isinstance(value, str):
+            raise ValueError('Invalid input or value type: both must be strings')
+
+        if operator == "contains":
+            return True if value.lower() in input.lower() else False
+        elif operator == "not contains":
+            return True if value.lower() not in input.lower() else False
+        elif operator == "start with":
+            return True if input.lower().startswith(value.lower()) else False
+        elif operator == "end with":
+            return True if input.lower().endswith(value.lower()) else False
+        elif operator == "empty":
+            return True if not input else False
+        elif operator == "not empty":
+            return True if input else False
+        elif operator == "=":
+            return True if input == value else False
+        elif operator == "≠":
+            return True if input != value else False
+        elif operator == ">":
+            try:
+                return True if float(input) > float(value) else False
+            except Exception:
+                return True if input > value else False
+        elif operator == "<":
+            try:
+                return True if float(input) < float(value) else False
+            except Exception:
+                return True if input < value else False
+        elif operator == "≥":
+            try:
+                return True if float(input) >= float(value) else False
+            except Exception:
+                return True if input >= value else False
+        elif operator == "≤":
+            try:
+                return True if float(input) <= float(value) else False
+            except Exception:
+                return True if input <= value else False
+
+        raise ValueError('Not supported operator: ' + operator)
--- a/agent/component/template.py
+++ b/agent/component/template.py
@ -0,0 +1,136 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import re
+from agent.component.base import ComponentBase, ComponentParamBase
+from jinja2 import Template as Jinja2Template
+
+
+class TemplateParam(ComponentParamBase):
+    """
+    Define the Template component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.content = ""
+        self.parameters = []
+
+    def check(self):
+        self.check_empty(self.content, "[Template] Content")
+        return True
+
+
+class Template(ComponentBase):
+    component_name = "Template"
+
+    def get_dependent_components(self):
+        inputs = self.get_input_elements()
+        cpnts = set([i["key"] for i in inputs if i["key"].lower().find("answer") < 0 and i["key"].lower().find("begin") < 0])
+        return list(cpnts)
+
+    def get_input_elements(self):
+        key_set = set([])
+        res = []
+        for r in re.finditer(r"\{([a-z]+[:@][a-z0-9_-]+)\}", self._param.content, flags=re.IGNORECASE):
+            cpn_id = r.group(1)
+            if cpn_id in key_set:
+                continue
+            if cpn_id.lower().find("begin@") == 0:
+                cpn_id, key = cpn_id.split("@")
+                for p in self._canvas.get_component(cpn_id)["obj"]._param.query:
+                    if p["key"] != key:
+                        continue
+                    res.append({"key": r.group(1), "name": p["name"]})
+                    key_set.add(r.group(1))
+                continue
+            cpn_nm = self._canvas.get_component_name(cpn_id)
+            if not cpn_nm:
+                continue
+            res.append({"key": cpn_id, "name": cpn_nm})
+            key_set.add(cpn_id)
+        return res
+
+    def _run(self, history, **kwargs):
+        content = self._param.content
+
+        self._param.inputs = []
+        for para in self.get_input_elements():
+            if para["key"].lower().find("begin@") == 0:
+                cpn_id, key = para["key"].split("@")
+                for p in self._canvas.get_component(cpn_id)["obj"]._param.query:
+                    if p["key"] == key:
+                        value = p.get("value", "")
+                        self.make_kwargs(para, kwargs, value)
+                        break
+                else:
+                    assert False, f"Can't find parameter '{key}' for {cpn_id}"
+                continue
+
+            component_id = para["key"]
+            cpn = self._canvas.get_component(component_id)["obj"]
+            if cpn.component_name.lower() == "answer":
+                hist = self._canvas.get_history(1)
+                if hist:
+                    hist = hist[0]["content"]
+                else:
+                    hist = ""
+                self.make_kwargs(para, kwargs, hist)
+                continue
+
+            _, out = cpn.output(allow_partial=False)
+
+            result = ""
+            if "content" in out.columns:
+                result = "\n".join(
+                    [o if isinstance(o, str) else str(o) for o in out["content"]]
+                )
+
+            self.make_kwargs(para, kwargs, result)
+
+        template = Jinja2Template(content)
+
+        try:
+            content = template.render(kwargs)
+        except Exception:
+            pass
+
+        for n, v in kwargs.items():
+            try:
+                v = json.dumps(v, ensure_ascii=False)
+            except Exception:
+                pass
+            content = re.sub(
+                r"\{%s\}" % re.escape(n), v, content
+            )
+            content = re.sub(
+                r"(\\\"|\")", "", content
+            )
+            content = re.sub(
+                r"(#+)", r" \1 ", content
+            )
+
+        return Template.be_output(content)
+
+    def make_kwargs(self, para, kwargs, value):
+        self._param.inputs.append(
+            {"component_id": para["key"], "content": value}
+        )
+        try:
+            value = json.loads(value)
+        except Exception:
+            pass
+        kwargs[para["key"]] = value
--- a/agent/component/tushare.py
+++ b/agent/component/tushare.py
@ -0,0 +1,72 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+from abc import ABC
+import pandas as pd
+import time
+import requests
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class TuShareParam(ComponentParamBase):
+    """
+    Define the TuShare component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.token = "xxx"
+        self.src = "eastmoney"
+        self.start_date = "2024-01-01 09:00:00"
+        self.end_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
+        self.keyword = ""
+
+    def check(self):
+        self.check_valid_value(self.src, "Quick News Source",
+                               ["sina", "wallstreetcn", "10jqka", "eastmoney", "yuncaijing", "fenghuang", "jinrongjie"])
+
+
+class TuShare(ComponentBase, ABC):
+    component_name = "TuShare"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = ",".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return TuShare.be_output("")
+
+        try:
+            tus_res = []
+            params = {
+                "api_name": "news",
+                "token": self._param.token,
+                "params": {"src": self._param.src, "start_date": self._param.start_date,
+                           "end_date": self._param.end_date}
+            }
+            response = requests.post(url="http://api.tushare.pro", data=json.dumps(params).encode('utf-8'))
+            response = response.json()
+            if response['code'] != 0:
+                return TuShare.be_output(response['msg'])
+            df = pd.DataFrame(response['data']['items'])
+            df.columns = response['data']['fields']
+            tus_res.append({"content": (df[df['content'].str.contains(self._param.keyword, case=False)]).to_markdown()})
+        except Exception as e:
+            return TuShare.be_output("**ERROR**: " + str(e))
+
+        if not tus_res:
+            return TuShare.be_output("")
+
+        return pd.DataFrame(tus_res)
--- a/agent/component/wencai.py
+++ b/agent/component/wencai.py
@ -0,0 +1,80 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import pandas as pd
+import pywencai
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class WenCaiParam(ComponentParamBase):
+    """
+    Define the WenCai component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+        self.query_type = "stock"
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_valid_value(self.query_type, "Query type",
+                               ['stock', 'zhishu', 'fund', 'hkstock', 'usstock', 'threeboard', 'conbond', 'insurance',
+                                'futures', 'lccp',
+                                'foreign_exchange'])
+
+
+class WenCai(ComponentBase, ABC):
+    component_name = "WenCai"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = ",".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return WenCai.be_output("")
+
+        try:
+            wencai_res = []
+            res = pywencai.get(query=ans, query_type=self._param.query_type, perpage=self._param.top_n)
+            if isinstance(res, pd.DataFrame):
+                wencai_res.append({"content": res.to_markdown()})
+            if isinstance(res, dict):
+                for item in res.items():
+                    if isinstance(item[1], list):
+                        wencai_res.append({"content": item[0] + "\n" + pd.DataFrame(item[1]).to_markdown()})
+                        continue
+                    if isinstance(item[1], str):
+                        wencai_res.append({"content": item[0] + "\n" + item[1]})
+                        continue
+                    if isinstance(item[1], dict):
+                        if "meta" in item[1].keys():
+                            continue
+                        wencai_res.append({"content": pd.DataFrame.from_dict(item[1], orient='index').to_markdown()})
+                        continue
+                    if isinstance(item[1], pd.DataFrame):
+                        if "image_url" in item[1].columns:
+                            continue
+                        wencai_res.append({"content": item[1].to_markdown()})
+                        continue
+
+                    wencai_res.append({"content": item[0] + "\n" + str(item[1])})
+        except Exception as e:
+            return WenCai.be_output("**ERROR**: " + str(e))
+
+        if not wencai_res:
+            return WenCai.be_output("")
+
+        return pd.DataFrame(wencai_res)
--- a/agent/component/wikipedia.py
+++ b/agent/component/wikipedia.py
@ -0,0 +1,67 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import wikipedia
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class WikipediaParam(ComponentParamBase):
+    """
+    Define the Wikipedia component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.top_n = 10
+        self.language = "en"
+
+    def check(self):
+        self.check_positive_integer(self.top_n, "Top N")
+        self.check_valid_value(self.language, "Wikipedia languages",
+                               ['af', 'pl', 'ar', 'ast', 'az', 'bg', 'nan', 'bn', 'be', 'ca', 'cs', 'cy', 'da', 'de',
+                                'et', 'el', 'en', 'es', 'eo', 'eu', 'fa', 'fr', 'gl', 'ko', 'hy', 'hi', 'hr', 'id',
+                                'it', 'he', 'ka', 'lld', 'la', 'lv', 'lt', 'hu', 'mk', 'arz', 'ms', 'min', 'my', 'nl',
+                                'ja', 'nb', 'nn', 'ce', 'uz', 'pt', 'kk', 'ro', 'ru', 'ceb', 'sk', 'sl', 'sr', 'sh',
+                                'fi', 'sv', 'ta', 'tt', 'th', 'tg', 'azb', 'tr', 'uk', 'ur', 'vi', 'war', 'zh', 'yue'])
+
+
+class Wikipedia(ComponentBase, ABC):
+    component_name = "Wikipedia"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = " - ".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return Wikipedia.be_output("")
+
+        try:
+            wiki_res = []
+            wikipedia.set_lang(self._param.language)
+            wiki_engine = wikipedia
+            for wiki_key in wiki_engine.search(ans, results=self._param.top_n):
+                page = wiki_engine.page(title=wiki_key, auto_suggest=False)
+                wiki_res.append({"content": '<a href="' + page.url + '">' + page.title + '</a> ' + page.summary})
+        except Exception as e:
+            return Wikipedia.be_output("**ERROR**: " + str(e))
+
+        if not wiki_res:
+            return Wikipedia.be_output("")
+
+        df = pd.DataFrame(wiki_res)
+        logging.debug(f"df: {df}")
+        return df
--- a/agent/component/yahoofinance.py
+++ b/agent/component/yahoofinance.py
@ -0,0 +1,84 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from abc import ABC
+import pandas as pd
+from agent.component.base import ComponentBase, ComponentParamBase
+import yfinance as yf
+
+
+class YahooFinanceParam(ComponentParamBase):
+    """
+    Define the YahooFinance component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.info = True
+        self.history = False
+        self.count = False
+        self.financials = False
+        self.income_stmt = False
+        self.balance_sheet = False
+        self.cash_flow_statement = False
+        self.news = True
+
+    def check(self):
+        self.check_boolean(self.info, "get all stock info")
+        self.check_boolean(self.history, "get historical market data")
+        self.check_boolean(self.count, "show share count")
+        self.check_boolean(self.financials, "show financials")
+        self.check_boolean(self.income_stmt, "income statement")
+        self.check_boolean(self.balance_sheet, "balance sheet")
+        self.check_boolean(self.cash_flow_statement, "cash flow statement")
+        self.check_boolean(self.news, "show news")
+
+
+class YahooFinance(ComponentBase, ABC):
+    component_name = "YahooFinance"
+
+    def _run(self, history, **kwargs):
+        ans = self.get_input()
+        ans = "".join(ans["content"]) if "content" in ans else ""
+        if not ans:
+            return YahooFinance.be_output("")
+
+        yahoo_res = []
+        try:
+            msft = yf.Ticker(ans)
+            if self._param.info:
+                yahoo_res.append({"content": "info:\n" + pd.Series(msft.info).to_markdown() + "\n"})
+            if self._param.history:
+                yahoo_res.append({"content": "history:\n" + msft.history().to_markdown() + "\n"})
+            if self._param.financials:
+                yahoo_res.append({"content": "calendar:\n" + pd.DataFrame(msft.calendar).to_markdown() + "\n"})
+            if self._param.balance_sheet:
+                yahoo_res.append({"content": "balance sheet:\n" + msft.balance_sheet.to_markdown() + "\n"})
+                yahoo_res.append(
+                    {"content": "quarterly balance sheet:\n" + msft.quarterly_balance_sheet.to_markdown() + "\n"})
+            if self._param.cash_flow_statement:
+                yahoo_res.append({"content": "cash flow statement:\n" + msft.cashflow.to_markdown() + "\n"})
+                yahoo_res.append(
+                    {"content": "quarterly cash flow statement:\n" + msft.quarterly_cashflow.to_markdown() + "\n"})
+            if self._param.news:
+                yahoo_res.append({"content": "news:\n" + pd.DataFrame(msft.news).to_markdown() + "\n"})
+        except Exception:
+            logging.exception("YahooFinance got exception")
+
+        if not yahoo_res:
+            return YahooFinance.be_output("")
+
+        return pd.DataFrame(yahoo_res)
--- a/agent/settings.py
+++ b/agent/settings.py
@ -0,0 +1,18 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+FLOAT_ZERO = 1e-8
+PARAM_MAXDEPTH = 5
--- a/agent/templates/DB
+++ b/agent/templates/DB
--- a/agent/templates/HR_callout_zh.json
+++ b/agent/templates/HR_callout_zh.json
--- a/agent/templates/customer_service.json
+++ b/agent/templates/customer_service.json
--- a/agent/templates/general_chat_bot.json
+++ b/agent/templates/general_chat_bot.json
--- a/agent/templates/interpreter.json
+++ b/agent/templates/interpreter.json
--- a/agent/templates/investment_advisor.json
+++ b/agent/templates/investment_advisor.json
--- a/agent/templates/medical_consultation.json
+++ b/agent/templates/medical_consultation.json
--- a/agent/templates/research_report.json
+++ b/agent/templates/research_report.json
--- a/agent/templates/seo_blog.json
+++ b/agent/templates/seo_blog.json
--- a/agent/templates/text2sql.json
+++ b/agent/templates/text2sql.json
--- a/agent/templates/websearch_assistant.json
+++ b/agent/templates/websearch_assistant.json
--- a/agent/test/client.py
+++ b/agent/test/client.py
@ -0,0 +1,49 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import argparse
+import os
+from functools import partial
+from agent.canvas import Canvas
+from agent import settings
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    dsl_default_path = os.path.join(
+        os.path.dirname(os.path.realpath(__file__)),
+        "dsl_examples",
+        "retrieval_and_generate.json",
+    )
+    parser.add_argument('-s', '--dsl', default=dsl_default_path, help="input dsl", action='store', required=False)
+    parser.add_argument('-t', '--tenant_id', help="Tenant ID", action='store', required=True)
+    parser.add_argument('-m', '--stream', default=False, help="Stream output", action='store_true', required=False)
+    args = parser.parse_args()
+
+    canvas = Canvas(open(args.dsl, "r").read(), args.tenant_id)
+    while True:
+        ans = canvas.run(stream=args.stream)
+        print("==================== Bot =====================\n>    ", end='')
+        if args.stream and isinstance(ans, partial):
+            cont = ""
+            for an in ans():
+                print(an["content"][len(cont):], end='', flush=True)
+                cont = an["content"]
+        else:
+            print(ans["content"])
+
+        if getattr(settings, "DEBUG", False):  # DEBUG is not defined in agent/settings.py as of this commit
+            print(canvas.path)
+        question = input("\n==================== User =====================\n> ")
+        canvas.add_user_input(question)
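The streaming branch of the client above prints only the part of each chunk that has not been shown yet. A minimal sketch of that pattern, assuming (as the loop implies) that every streamed chunk carries the full text generated so far; `stream_deltas` is a hypothetical helper name, not part of the repository:

```python
def stream_deltas(chunks):
    """Yield only the newly added suffix of each cumulative chunk."""
    seen = ""
    for chunk in chunks:
        text = chunk["content"]
        # Everything before len(seen) was already emitted by an earlier chunk.
        yield text[len(seen):]
        seen = text


for piece in stream_deltas([{"content": "He"}, {"content": "Hello"}, {"content": "Hello!"}]):
    print(piece, end="", flush=True)
print()
```

If a producer ever emitted incremental pieces instead of cumulative text, slicing by `len(seen)` would drop characters, so the assumption is worth checking against the actual stream.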
--- a/agent/test/dsl_examples/baidu_generate_and_switch.json
+++ b/agent/test/dsl_examples/baidu_generate_and_switch.json
@ -0,0 +1,129 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["baidu:0"],
+                "upstream": ["begin", "message:0", "message:1", "message:2"]
+            },
+            "baidu:0": {
+                "obj": {
+                    "component_name": "Baidu",
+                    "params": {}
+                },
+                "downstream": ["generate:0"],
+                "upstream": ["answer:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the user's question based on what Baidu searched. First, please output the user's question and the content searched by Baidu, and then answer yes, no, or I don't know. Here is the user's question: {user_input} The above is the user's question. Here is what Baidu searched for: {baidu} The above is the content searched by Baidu.",
+                      "temperature": 0.2
+                    },
+                    "parameters": [
+                                      {
+                                          "component_id": "answer:0",
+                                          "id": "69415446-49bf-4d4b-8ec9-ac86066f7709",
+                                          "key": "user_input"
+                                      },
+                                      {
+                                          "component_id": "baidu:0",
+                                          "id": "83363c2a-00a8-402f-a45c-ddc4097d7d8b",
+                                          "key": "baidu"
+                                      }
+                                  ]
+                        },
+                "downstream": ["switch:0"],
+                "upstream": ["baidu:0"]
+            },
+            "switch:0": {
+                "obj": {
+                    "component_name": "Switch",
+                    "params": {
+                      "conditions": [
+                                        {
+                                            "logical_operator" : "or",
+                                            "items" : [
+                                                          {"cpn_id": "generate:0", "operator": "contains", "value": "yes"},
+                                                          {"cpn_id": "generate:0", "operator": "contains", "value": "yeah"}
+                                                      ],
+                                             "to": "message:0"
+                                        },
+                                        {
+                                            "logical_operator" : "and",
+                                            "items" : [
+                                                          {"cpn_id": "generate:0", "operator": "contains", "value": "no"},
+                                                          {"cpn_id": "generate:0", "operator": "not contains", "value": "yes"},
+                                                          {"cpn_id": "generate:0", "operator": "not contains", "value": "know"}
+                                                      ],
+                                            "to": "message:1"
+                                        },
+                                        {
+                                            "logical_operator" : "",
+                                            "items" : [
+                                                          {"cpn_id": "generate:0", "operator": "contains", "value": "know"}
+                                                      ],
+                                             "to": "message:2"
+                                        }
+                                    ],
+                      "end_cpn_id": "answer:0"
+
+                    }
+                },
+                "downstream": ["message:0", "message:1", "message:2"],
+                "upstream": ["generate:0"]
+            },
+            "message:0": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                        "messages": ["YES YES YES YES YES YES YES YES YES YES YES YES"]
+                    }
+                },
+
+                "upstream": ["switch:0"],
+                "downstream": ["answer:0"]
+            },
+            "message:1": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                        "messages": ["NO NO NO NO NO NO NO NO NO NO NO NO NO NO"]
+                    }
+                },
+
+                "upstream": ["switch:0"],
+                "downstream": ["answer:0"]
+            },
+            "message:2": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                        "messages": ["I DON'T KNOW---------------------------"]
+                    }
+                },
+
+                "upstream": ["switch:0"],
+                "downstream": ["answer:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "reference": {},
+  "path": [],
+  "answer": []
+}
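The DSL file above wires components together through paired `downstream`/`upstream` lists plus `to` targets inside `Switch` conditions. A small consistency check over that structure might look like the sketch below. It assumes that edges are meant to be reciprocal and that every condition target should also be listed in `downstream`; neither rule is stated in the source, and `check_dsl_edges` is a hypothetical helper, not part of the agent package.

```python
def check_dsl_edges(dsl: dict) -> list:
    """Return human-readable problems found in a DSL component graph."""
    problems = []
    components = dsl.get("components", {})
    for cid, cpn in components.items():
        downstream = cpn.get("downstream", [])
        for target in downstream:
            if target not in components:
                problems.append(f"{cid}: downstream '{target}' is not defined")
            elif cid not in components[target].get("upstream", []):
                problems.append(f"{target}: upstream is missing '{cid}'")
        # Switch-style routing: each condition target should be a downstream edge.
        params = cpn.get("obj", {}).get("params", {})
        for cond in params.get("conditions", []):
            to = cond.get("to")
            if to and to not in downstream:
                problems.append(f"{cid}: condition target '{to}' is not in downstream")
    return problems


example = {
    "components": {
        "switch:0": {
            "obj": {"component_name": "Switch",
                    "params": {"conditions": [{"to": "message:0"}, {"to": "message:2"}]}},
            "downstream": ["message:0"],
            "upstream": [],
        },
        "message:0": {
            "obj": {"component_name": "Message", "params": {}},
            "downstream": [],
            "upstream": ["switch:0"],
        },
    }
}
print(check_dsl_edges(example))
```

Running a check of this kind over the files in `dsl_examples` before loading them into a canvas would surface routing targets that are referenced but never wired up.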
--- a/agent/test/dsl_examples/categorize.json
+++ b/agent/test/dsl_examples/categorize.json
@ -0,0 +1,73 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["categorize:0"],
+                "upstream": ["begin"]
+            },
+            "categorize:0": {
+                "obj": {
+                    "component_name": "Categorize",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "category_description": {
+                        "product_related": {
+                          "description": "The question is about the product usage, appearance and how it works.",
+                          "examples": "Why it always beaming?\nHow to install it onto the wall?\nIt leaks, what to do?",
+                          "to": "message:0"
+                        },
+                        "others": {
+                          "description": "The question is not about the product usage, appearance and how it works.",
+                          "examples": "How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
+                          "to": "message:1"
+                        }
+                      }
+                    }
+                },
+                "downstream": ["message:0","message:1"],
+                "upstream": ["answer:0"]
+            },
+            "message:0": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Message 0!!!!!!!"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["categorize:0"]
+            },
+            "message:1": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Message 1!!!!!!!"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["categorize:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "path": [],
+  "reference": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/concentrator_message.json
+++ b/agent/test/dsl_examples/concentrator_message.json
@ -0,0 +1,113 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["categorize:0"],
+                "upstream": ["begin"]
+            },
+            "categorize:0": {
+                "obj": {
+                    "component_name": "Categorize",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "category_description": {
+                        "product_related": {
+                          "description": "The question is about the product usage, appearance and how it works.",
+                          "examples": "Why it always beaming?\nHow to install it onto the wall?\nIt leaks, what to do?",
+                          "to": "concentrator:0"
+                        },
+                        "others": {
+                          "description": "The question is not about the product usage, appearance and how it works.",
+                          "examples": "How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
+                          "to": "concentrator:1"
+                        }
+                      }
+                    }
+                },
+                "downstream": ["concentrator:0","concentrator:1"],
+                "upstream": ["answer:0"]
+            },
+            "concentrator:0": {
+                "obj": {
+                    "component_name": "Concentrator",
+                    "params": {}
+                },
+                "downstream": ["message:0"],
+                "upstream": ["categorize:0"]
+            },
+            "concentrator:1": {
+                "obj": {
+                    "component_name": "Concentrator",
+                    "params": {}
+                },
+                "downstream": ["message:1_0","message:1_1","message:1_2"],
+                "upstream": ["categorize:0"]
+            },
+            "message:0": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Message 0_0!!!!!!!"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["concentrator:0"]
+            },
+            "message:1_0": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Message 1_0!!!!!!!"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["concentrator:1"]
+            },
+            "message:1_1": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Message 1_1!!!!!!!"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["concentrator:1"]
+            },
+            "message:1_2": {
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Message 1_2!!!!!!!"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["concentrator:1"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "path": [],
+  "reference": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/customer_service.json
+++ b/agent/test/dsl_examples/customer_service.json
@ -0,0 +1,157 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi! How can I help you?"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["categorize:0"],
+                "upstream": ["begin", "generate:0", "generate:casual", "generate:answer", "generate:complain", "generate:ask_contact", "message:get_contact"]
+            },
+            "categorize:0": {
+                "obj": {
+                    "component_name": "Categorize",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "category_description": {
+                        "product_related": {
+                          "description": "The question is about the product usage, appearance and how it works.",
+                          "examples": "Why it always beaming?\nHow to install it onto the wall?\nIt leaks, what to do?\nException: Can't connect to ES cluster\nHow to build the RAGFlow image from scratch",
+                          "to": "retrieval:0"
+                        },
+                        "casual": {
+                          "description": "The question is not about the product usage, appearance and how it works. Just casual chat.",
+                          "examples": "How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
+                          "to": "generate:casual"
+                        },
+                        "complain": {
+                          "description": "The customer complains about, or even curses, the product or service you provide, but the comment is not specific enough.",
+                          "examples": "How bad is it.\nIt's really sucks.\nDamn, for God's sake, can it be more steady?\nShit, I just can't use this shit.\nI can't stand it anymore.",
+                          "to": "generate:complain"
+                        },
+                        "answer": {
+                          "description": "This answer provides specific contact information, such as an e-mail address, phone number, WeChat ID, LINE ID, Twitter handle, Discord ID, etc.",
+                          "examples": "My phone number is 203921\nkevinhu.hk@gmail.com\nThis is my discord number: johndowson_29384",
+                          "to": "message:get_contact"
+                        }
+                      },
+                      "message_history_window_size": 8
+                    }
+                },
+                "downstream": ["retrieval:0", "generate:casual", "generate:complain", "message:get_contact"],
+                "upstream": ["answer:0"]
+            },
+            "generate:casual": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are a customer support agent, but the customer wants to have a casual chat with you instead of consulting about the product. Be nice, funny, enthusiastic, and caring.",
+                      "temperature": 0.9,
+                      "message_history_window_size": 12,
+                      "cite": false
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["categorize:0"]
+            },
+            "generate:complain": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are a customer support agent. The customer complains or even curses about the product, but not specifically enough. You need to ask them what the specific problem with the product is. Be nice, patient, and caring, and soothe the customer's emotions first.",
+                      "temperature": 0.9,
+                      "message_history_window_size": 12,
+                      "cite": false
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["categorize:0"]
+            },
+            "retrieval:0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {
+                      "similarity_threshold": 0.2,
+                      "keywords_similarity_weight": 0.3,
+                      "top_n": 6,
+                      "top_k": 1024,
+                      "rerank_id": "BAAI/bge-reranker-v2-m3",
+                      "kb_ids": ["869a236818b811ef91dffa163e197198"]
+                    }
+                },
+                "downstream": ["relevant:0"],
+                "upstream": ["categorize:0"]
+            },
+            "relevant:0": {
+                "obj": {
+                    "component_name": "Relevant",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "temperature": 0.02,
+                      "yes": "generate:answer",
+                      "no": "generate:ask_contact"
+                    }
+                },
+                "downstream": ["generate:answer", "generate:ask_contact"],
+                "upstream": ["retrieval:0"]
+            },
+            "generate:answer": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the question based on the content of the knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n      The knowledge base content is as follows:\n      {input}\n      The above is the content of the knowledge base.",
+                      "temperature": 0.02
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["relevant:0"]
+            },
+            "generate:ask_contact": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are a customer support agent, but you can't answer the customer's question. You need to request their contact information, such as e-mail, phone number, WeChat number, LINE number, Twitter, Discord, etc. Product experts will contact them later. Please do not ask the same question twice.",
+                      "temperature": 0.9,
+                      "message_history_window_size": 12,
+                      "cite": false
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["relevant:0"]
+            },
+            "message:get_contact": {
+                "obj":{
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Okay, I've written this down. What else can I do for you?",
+                        "Got it. What else can I do for you?",
+                        "Thanks for your trust! Our expert will contact you ASAP. So, is there anything else I can do for you?",
+                        "Thanks! So, is there anything else I can do for you?"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["categorize:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "path": [],
+  "reference": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/exesql.json
+++ b/agent/test/dsl_examples/exesql.json
@ -0,0 +1,43 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["exesql:0"],
+                "upstream": ["begin", "exesql:0"]
+            },
+            "exesql:0": {
+                "obj": {
+                    "component_name": "ExeSQL",
+                    "params": {
+                      "database": "rag_flow",
+                      "username": "root",
+                      "host": "mysql",
+                      "port": 3306,
+                      "password": "infini_rag_flow",
+                      "top_n": 3
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["answer:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "reference": {},
+  "path": [],
+  "answer": []
+}
+
--- a/agent/test/dsl_examples/headhunter_zh.json
+++ b/agent/test/dsl_examples/headhunter_zh.json
@ -0,0 +1,210 @@
+{
+  "components": {
+    "begin": {
+      "obj": {
+        "component_name": "Begin",
+        "params": {
+          "prologue": "您好！我是AGI方向的猎头，了解到您是这方面的大佬，然后冒昧的就联系到您。这边有个机会想和您分享，RAGFlow正在招聘您这个岗位的资深的工程师不知道您那边是不是感兴趣？"
+        }
+      },
+      "downstream": ["answer:0"],
+      "upstream": []
+    },
+    "answer:0": {
+      "obj": {
+        "component_name": "Answer",
+        "params": {}
+      },
+      "downstream": ["categorize:0"],
+      "upstream": ["begin", "message:reject"]
+    },
+    "categorize:0": {
+      "obj": {
+        "component_name": "Categorize",
+        "params": {
+          "llm_id": "deepseek-chat",
+          "category_description": {
+            "about_job": {
+              "description": "该问题关于职位本身或公司的信息。",
+              "examples": "什么岗位？\n汇报对象是谁?\n公司多少人？\n公司有啥产品？\n具体工作内容是啥？\n地点哪里？\n双休吗？",
+              "to": "retrieval:0"
+            },
+            "casual": {
+              "description": "该问题不关于职位本身或公司的信息，属于闲聊。",
+              "examples": "你好\n好久不见\n你男的女的？\n你是猴子派来的救兵吗？\n上午开会了?\n你叫啥？\n最近市场如何?生意好做吗？",
+              "to": "generate:casual"
+            },
+            "interested": {
+              "description": "该回答表示他对于该职位感兴趣。",
+              "examples": "嗯\n说吧\n说说看\n还好吧\n是的\n哦\nyes\n具体说说",
+              "to": "message:introduction"
+            },
+            "answer": {
+              "description": "该回答表示他对于该职位不感兴趣，或感觉受到骚扰。",
+              "examples": "不需要\n不感兴趣\n暂时不看\n不要\nno\n我已经不干这个了\n我不是这个方向的",
+              "to": "message:reject"
+            }
+          }
+        }
+      },
+      "downstream": [
+        "message:introduction",
+        "generate:casual",
+        "message:reject",
+        "retrieval:0"
+      ],
+      "upstream": ["answer:0"]
+    },
+    "message:introduction": {
+      "obj": {
+        "component_name": "Message",
+        "params": {
+          "messages": [
+            "我简单介绍一下：\nRAGFlow 是一款基于深度文档理解构建的开源 RAG（Retrieval-Augmented Generation）引擎。RAGFlow 可以为各种规模的企业及个人提供一套精简的 RAG 工作流程，结合大语言模型（LLM）针对用户各类不同的复杂格式数据提供可靠的问答以及有理有据的引用。https://github.com/infiniflow/ragflow\n您那边还有什么要了解的？"
+          ]
+        }
+      },
+      "downstream": ["answer:1"],
+      "upstream": ["categorize:0"]
+    },
+    "answer:1": {
+      "obj": {
+        "component_name": "Answer",
+        "params": {}
+      },
+      "downstream": ["categorize:1"],
+      "upstream": [
+        "message:introduction",
+        "generate:aboutJob",
+        "generate:casual",
+        "generate:get_wechat",
+        "generate:nowechat"
+      ]
+    },
+    "categorize:1": {
+      "obj": {
+        "component_name": "Categorize",
+        "params": {
+          "llm_id": "deepseek-chat",
+          "category_description": {
+            "about_job": {
+              "description": "该问题关于职位本身或公司的信息。",
+              "examples": "什么岗位？\n汇报对象是谁?\n公司多少人？\n公司有啥产品？\n具体工作内容是啥？\n地点哪里？\n双休吗？",
+              "to": "retrieval:0"
+            },
+            "casual": {
+              "description": "该问题不关于职位本身或公司的信息，属于闲聊。",
+              "examples": "你好\n好久不见\n你男的女的？\n你是猴子派来的救兵吗？\n上午开会了?\n你叫啥？\n最近市场如何?生意好做吗？",
+              "to": "generate:casual"
+            },
+            "wechat": {
+              "description": "该回答表示他愿意加微信,或者已经报了微信号。",
+              "examples": "嗯\n可以\n是的\n哦\nyes\n15002333453\nwindblow_2231",
+              "to": "generate:get_wechat"
+            },
+            "giveup": {
+              "description": "该回答表示他不愿意加微信。",
+              "examples": "不需要\n不感兴趣\n暂时不看\n不要\nno\n不方便\n不知道还要加我微信",
+              "to": "generate:nowechat"
+            }
+          },
+          "message_history_window_size": 8
+        }
+      },
+      "downstream": [
+        "retrieval:0",
+        "generate:casual",
+        "generate:get_wechat",
+        "generate:nowechat"
+      ],
+      "upstream": ["answer:1"]
+    },
+    "generate:casual": {
+      "obj": {
+        "component_name": "Generate",
+        "params": {
+          "llm_id": "deepseek-chat",
+          "prompt": "你是AGI方向的猎头，现在候选人聊了和职位无关的话题，请耐心的回应候选人，并将话题往该AGI的职位上带，最好能要到候选人微信号以便后面保持联系。",
+          "temperature": 0.9,
+          "message_history_window_size": 12,
+          "cite": false
+        }
+      },
+      "downstream": ["answer:1"],
+      "upstream": ["categorize:0", "categorize:1"]
+    },
+    "retrieval:0": {
+      "obj": {
+        "component_name": "Retrieval",
+        "params": {
+          "similarity_threshold": 0.2,
+          "keywords_similarity_weight": 0.3,
+          "top_n": 6,
+          "top_k": 1024,
+          "rerank_id": "BAAI/bge-reranker-v2-m3",
+          "kb_ids": ["869a236818b811ef91dffa163e197198"]
+        }
+      },
+      "downstream": ["generate:aboutJob"],
+      "upstream": ["categorize:0", "categorize:1"]
+    },
+    "generate:aboutJob": {
+      "obj": {
+        "component_name": "Generate",
+        "params": {
+          "llm_id": "deepseek-chat",
+          "prompt": "你是AGI方向的猎头，候选人问了有关职位或公司的问题，你根据以下职位信息回答。如果职位信息中不包含候选人的问题就回答不清楚、不知道、有待确认等。回答完后引导候选人加微信号，如：\n - 方便加一下微信吗，我把JD发您看看？\n  - 微信号多少，我把详细职位JD发您？\n      职位信息如下:\n      {input}\n      职位信息如上。",
+          "temperature": 0.02
+        }
+      },
+      "downstream": ["answer:1"],
+      "upstream": ["retrieval:0"]
+    },
+    "generate:get_wechat": {
+      "obj": {
+        "component_name": "Generate",
+        "params": {
+          "llm_id": "deepseek-chat",
+          "prompt": "你是AGI方向的猎头，候选人表示不反感加微信，如果对方已经报了微信号，表示感谢和信任并表示马上会加上；如果没有，则问对方微信号多少。你的微信号是weixin_kevin，E-mail是kkk@ragflow.com。说话不要重复。不要总是您好。",
+          "temperature": 0.1,
+          "message_history_window_size": 12,
+          "cite": false
+        }
+      },
+      "downstream": ["answer:1"],
+      "upstream": ["categorize:1"]
+    },
+    "generate:nowechat": {
+      "obj": {
+        "component_name": "Generate",
+        "params": {
+          "llm_id": "deepseek-chat",
+          "prompt": "你是AGI方向的猎头，当你提出加微信时对方表示拒绝。你需要耐心礼貌的回应候选人，表示对于保护隐私信息给予理解，也可以询问他对该职位的看法和顾虑。并在恰当的时机再次询问微信联系方式。也可以鼓励候选人主动与你取得联系。你的微信号是weixin_kevin，E-mail是kkk@ragflow.com。说话不要重复。不要总是您好。",
+          "temperature": 0.1,
+          "message_history_window_size": 12,
+          "cite": false
+        }
+      },
+      "downstream": ["answer:1"],
+      "upstream": ["categorize:1"]
+    },
+    "message:reject": {
+      "obj": {
+        "component_name": "Message",
+        "params": {
+          "messages": [
+            "好的，祝您生活愉快，工作顺利。",
+            "哦，好的，感谢您宝贵的时间！"
+          ]
+        }
+      },
+      "downstream": ["answer:0"],
+      "upstream": ["categorize:0"]
+    }
+  },
+  "history": [],
+  "messages": [],
+  "path": [],
+  "reference": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/intergreper.json
+++ b/agent/test/dsl_examples/intergreper.json
@ -0,0 +1,39 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there! Please enter the text you want to translate in a format like: 'text you want to translate' => target language. For example: 您好！ => English"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["generate:0"],
+                "upstream": ["begin", "generate:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are a professional interpreter.\n- Role: a professional interpreter.\n- Input format: content to be translated => target language. \n- Answer format: => translated content in target language. \n- Examples:\n  - user: 您好！ => English. assistant: => Hello!\n  - user: You look good today. => Japanese. assistant: => 今日は調子がいいですね 。\n",
+                      "temperature": 0.5
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["answer:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "reference": {},
+  "path": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/interpreter.json
+++ b/agent/test/dsl_examples/interpreter.json
@ -0,0 +1,39 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there! Please enter the text you want to translate in a format like: 'text you want to translate' => target language. For example: 您好！ => English"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["generate:0"],
+                "upstream": ["begin", "generate:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are a professional interpreter.\n- Role: a professional interpreter.\n- Input format: content to be translated => target language. \n- Answer format: => translated content in target language. \n- Examples:\n  - user: 您好！ => English. assistant: => Hello!\n  - user: You look good today. => Japanese. assistant: => 今日は調子がいいですね 。\n",
+                      "temperature": 0.5
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["answer:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "reference": {},
+  "path": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/keyword_wikipedia_and_generate.json
+++ b/agent/test/dsl_examples/keyword_wikipedia_and_generate.json
@ -0,0 +1,62 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["keyword:0"],
+                "upstream": ["begin", "generate:1"]
+            },
+            "keyword:0": {
+                "obj": {
+                    "component_name": "KeywordExtract",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "- Role: You're a question analyzer.\n    - Requirements:\n     - Summarize the user's question and give the top %s important keywords/phrases.\n    - Use a comma as the delimiter to separate keywords/phrases.\n    - Answer format: (in the language of the user's question)\n    - keyword: ",
+                      "temperature": 0.2,
+                      "top_n": 1
+                    }
+                },
+                "downstream": ["wikipedia:0"],
+                "upstream": ["answer:0"]
+            },
+            "wikipedia:0": {
+                "obj":{
+                    "component_name": "Wikipedia",
+                    "params": {
+                      "top_n": 10
+                    }
+                },
+                "downstream": ["generate:1"],
+                "upstream": ["keyword:0"]
+            },
+            "generate:1": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the question based on content from Wikipedia. When the answer from Wikipedia is incomplete, you need to output the URL link of the corresponding content as well. When all the content searched from Wikipedia is irrelevant to the question, your answer must include the sentence, \"The answer you are looking for is not found in the Wikipedia!\". Answers need to consider chat history.\n       The content of Wikipedia is as follows:\n    {input}\n     The above is the content of Wikipedia.",
+                      "temperature": 0.2
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["wikipedia:0"]
+            }
+  },
+  "history": [],
+  "path": [],
+  "messages": [],
+  "reference": {},
+  "answer": []
+}
--- a/agent/test/dsl_examples/retrieval_and_generate.json
+++ b/agent/test/dsl_examples/retrieval_and_generate.json
@ -0,0 +1,54 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["retrieval:0"],
+                "upstream": ["begin", "generate:0"]
+            },
+            "retrieval:0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {
+                      "similarity_threshold": 0.2,
+                      "keywords_similarity_weight": 0.3,
+                      "top_n": 6,
+                      "top_k": 1024,
+                      "rerank_id": "BAAI/bge-reranker-v2-m3",
+                      "kb_ids": ["869a236818b811ef91dffa163e197198"]
+                    }
+                },
+                "downstream": ["generate:0"],
+                "upstream": ["answer:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n      Here is the knowledge base:\n      {input}\n      The above is the knowledge base.",
+                      "temperature": 0.2
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["retrieval:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "reference": {},
+  "path": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/retrieval_categorize_and_generate.json
+++ b/agent/test/dsl_examples/retrieval_categorize_and_generate.json
@ -0,0 +1,88 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["categorize:0"],
+                "upstream": ["begin", "generate:0", "message:0"]
+            },
+            "categorize:0": {
+                "obj": {
+                    "component_name": "Categorize",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "category_description": {
+                        "product_related": {
+                          "description": "The question is about the product usage, appearance and how it works.",
+                          "examples": "Why is it always beaming?\nHow do I install it on the wall?\nIt leaks, what should I do?",
+                          "to": "retrieval:0"
+                        },
+                        "others": {
+                          "description": "The question is not about the product usage, appearance and how it works.",
+                          "examples": "How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
+                          "to": "message:0"
+                        }
+                      }
+                    }
+                },
+                "downstream": ["retrieval:0", "message:0"],
+                "upstream": ["answer:0"]
+            },
+            "message:0": {
+                "obj":{
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Sorry, I don't know. I'm an AI bot."
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["categorize:0"]
+            },
+            "retrieval:0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {
+                      "similarity_threshold": 0.2,
+                      "keywords_similarity_weight": 0.3,
+                      "top_n": 6,
+                      "top_k": 1024,
+                      "rerank_id": "BAAI/bge-reranker-v2-m3",
+                      "kb_ids": ["869a236818b811ef91dffa163e197198"]
+                    }
+                },
+                "downstream": ["generate:0"],
+                "upstream": ["categorize:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n      Here is the knowledge base:\n      {input}\n      The above is the knowledge base.",
+                      "temperature": 0.2
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["retrieval:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "reference": {},
+  "path": [],
+  "answer": []
+}
--- a/agent/test/dsl_examples/retrieval_relevant_and_generate.json
+++ b/agent/test/dsl_examples/retrieval_relevant_and_generate.json
@ -0,0 +1,82 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["retrieval:0"],
+                "upstream": ["begin", "generate:0", "message:0"]
+            },
+            "retrieval:0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {
+                      "similarity_threshold": 0.2,
+                      "keywords_similarity_weight": 0.3,
+                      "top_n": 6,
+                      "top_k": 1024,
+                      "rerank_id": "BAAI/bge-reranker-v2-m3",
+                      "kb_ids": ["869a236818b811ef91dffa163e197198"],
+                      "empty_response": "Sorry, the knowledge base has no related information."
+                    }
+                },
+                "downstream": ["relevant:0"],
+                "upstream": ["answer:0"]
+            },
+            "relevant:0": {
+                "obj": {
+                    "component_name": "Relevant",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "temperature": 0.02,
+                      "yes": "generate:0",
+                      "no": "message:0"
+                    }
+                },
+                "downstream": ["message:0", "generate:0"],
+                "upstream": ["retrieval:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the question based on the content of the knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n      The knowledge base content is as follows:\n      {input}\n      The above is the content of the knowledge base.",
+                      "temperature": 0.2
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["relevant:0"]
+            },
+            "message:0": {
+                "obj":{
+                    "component_name": "Message",
+                    "params": {
+                      "messages": [
+                        "Sorry, I don't know. Please leave your contact information; our experts will contact you later. What's your e-mail/phone/WeChat?",
+                        "I'm an AI bot and not quite sure about this question. Please leave your contact information; our experts will contact you later. What's your e-mail/phone/WeChat?",
+                        "I can't find an answer in my knowledge base. Please leave your contact information; our experts will contact you later. What's your e-mail/phone/WeChat?"
+                      ]
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["relevant:0"]
+            }
+  },
+  "history": [],
+  "path": [],
+  "messages": [],
+  "reference": {},
+  "answer": []
+}
--- a/agent/test/dsl_examples/retrieval_relevant_keyword_baidu_and_generate.json
+++ b/agent/test/dsl_examples/retrieval_relevant_keyword_baidu_and_generate.json
@ -0,0 +1,103 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["retrieval:0"],
+                "upstream": ["begin", "generate:0", "generate:1"]
+            },
+            "retrieval:0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {
+                      "similarity_threshold": 0.2,
+                      "keywords_similarity_weight": 0.3,
+                      "top_n": 6,
+                      "top_k": 1024,
+                      "rerank_id": "BAAI/bge-reranker-v2-m3",
+                      "kb_ids": ["21ca4e6a2c8911ef8b1e0242ac120006"],
+                      "empty_response": "Sorry, the knowledge base has no related information."
+                    }
+                },
+                "downstream": ["relevant:0"],
+                "upstream": ["answer:0"]
+            },
+            "relevant:0": {
+                "obj": {
+                    "component_name": "Relevant",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "temperature": 0.02,
+                      "yes": "generate:0",
+                      "no": "keyword:0"
+                    }
+                },
+                "downstream": ["keyword:0", "generate:0"],
+                "upstream": ["retrieval:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the question based on the content of the knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n      The knowledge base content is as follows:\n      {input}\n      The above is the content of the knowledge base.",
+                      "temperature": 0.2
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["relevant:0"]
+            },
+            "keyword:0": {
+                "obj": {
+                    "component_name": "KeywordExtract",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "- Role: You're a question analyzer.\n    - Requirements:\n     - Summarize the user's question and give the top %s important keywords/phrases.\n    - Use a comma as the delimiter to separate keywords/phrases.\n    - Answer format: (in the language of the user's question)\n    - keyword: ",
+                      "temperature": 0.2,
+                      "top_n": 1
+                    }
+                },
+                "downstream": ["baidu:0"],
+                "upstream": ["relevant:0"]
+            },
+            "baidu:0": {
+                "obj":{
+                    "component_name": "Baidu",
+                    "params": {
+                      "top_n": 10
+                    }
+                },
+                "downstream": ["generate:1"],
+                "upstream": ["keyword:0"]
+            },
+            "generate:1": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the question based on content searched from Baidu. When the answer from a Baidu search is incomplete, you need to output the URL link of the corresponding content as well. When all the content searched from Baidu is irrelevant to the question, your answer must include the sentence, \"The answer you are looking for is not found in the Baidu search!\". Answers need to consider chat history.\n       The content of Baidu search is as follows:\n    {input}\n     The above is the content of Baidu search.",
+                      "temperature": 0.2
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["baidu:0"]
+            }
+  },
+  "history": [],
+  "path": [],
+  "messages": [],
+  "reference": {},
+  "answer": []
+}
--- a/agent/test/dsl_examples/retrieval_relevant_rewrite_and_generate.json
+++ b/agent/test/dsl_examples/retrieval_relevant_rewrite_and_generate.json
@ -0,0 +1,79 @@
+{
+  "components": {
+            "begin": {
+                "obj":{
+                    "component_name": "Begin",
+                    "params": {
+                      "prologue": "Hi there!"
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": []
+            },
+            "answer:0": {
+                "obj": {
+                    "component_name": "Answer",
+                    "params": {}
+                },
+                "downstream": ["retrieval:0"],
+                "upstream": ["begin", "generate:0", "switch:0"]
+            },
+            "retrieval:0": {
+                "obj": {
+                    "component_name": "Retrieval",
+                    "params": {
+                      "similarity_threshold": 0.2,
+                      "keywords_similarity_weight": 0.3,
+                      "top_n": 6,
+                      "top_k": 1024,
+                      "rerank_id": "BAAI/bge-reranker-v2-m3",
+                      "kb_ids": ["869a236818b811ef91dffa163e197198"],
+                      "empty_response": "Sorry, the knowledge base has no related information."
+                    }
+                },
+                "downstream": ["relevant:0"],
+                "upstream": ["answer:0", "rewrite:0"]
+            },
+            "relevant:0": {
+                "obj": {
+                    "component_name": "Relevant",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "temperature": 0.02,
+                      "yes": "generate:0",
+                      "no": "rewrite:0"
+                    }
+                },
+                "downstream": ["generate:0", "rewrite:0"],
+                "upstream": ["retrieval:0"]
+            },
+            "generate:0": {
+                "obj": {
+                    "component_name": "Generate",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "prompt": "You are an intelligent assistant. Please answer the question based on the content of the knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n      The knowledge base content is as follows:\n      {input}\n      The above is the content of the knowledge base.",
+                      "temperature": 0.02
+                    }
+                },
+                "downstream": ["answer:0"],
+                "upstream": ["relevant:0"]
+            },
+            "rewrite:0": {
+                "obj":{
+                    "component_name": "RewriteQuestion",
+                    "params": {
+                      "llm_id": "deepseek-chat",
+                      "temperature": 0.8
+                    }
+                },
+                "downstream": ["retrieval:0"],
+                "upstream": ["relevant:0"]
+            }
+  },
+  "history": [],
+  "messages": [],
+  "path": [],
+  "reference": [],
+  "answer": []
+}
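The DSL examples above wire components together through paired `downstream` and `upstream` lists. A minimal sketch of a consistency check follows, assuming each link is meant to be mirrored on the other side (as most links in these examples are). `check_links` is a hypothetical helper for illustration, not part of the agent package.

```python
def check_links(components: dict) -> list[str]:
    """Return human-readable problems found in upstream/downstream links."""
    problems = []
    for name, comp in components.items():
        for down in comp.get("downstream", []):
            if down not in components:
                problems.append(f"{name}: unknown downstream {down}")
            elif name not in components[down].get("upstream", []):
                problems.append(f"{name} -> {down}: missing reverse upstream link")
        for up in comp.get("upstream", []):
            if up not in components:
                problems.append(f"{name}: unknown upstream {up}")
            elif name not in components[up].get("downstream", []):
                problems.append(f"{up} -> {name}: missing reverse downstream link")
    return problems


# Tiny illustrative graph (not one of the shipped DSL files).
dsl = {
    "begin": {"downstream": ["answer:0"], "upstream": []},
    "answer:0": {"downstream": [], "upstream": ["begin", "switch:0"]},
}
print(check_links(dsl))
```

Run against `retrieval_relevant_rewrite_and_generate.json`, a check like this would flag `switch:0`, which `answer:0` lists as upstream but which is not defined in that file.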
--- a/agentic_reasoning/__init__.py
+++ b/agentic_reasoning/__init__.py
@ -0,0 +1 @@
+from .deep_research import DeepResearcher as DeepResearcher
--- a/agentic_reasoning/deep_research.py
+++ b/agentic_reasoning/deep_research.py
@ -0,0 +1,167 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+import re
+from functools import partial
+from agentic_reasoning.prompts import BEGIN_SEARCH_QUERY, BEGIN_SEARCH_RESULT, END_SEARCH_RESULT, MAX_SEARCH_LIMIT, \
+    END_SEARCH_QUERY, REASON_PROMPT, RELEVANT_EXTRACTION_PROMPT
+from api.db.services.llm_service import LLMBundle
+from rag.nlp import extract_between
+from rag.prompts import kb_prompt
+from rag.utils.tavily_conn import Tavily
+
+
+class DeepResearcher:
+    def __init__(self,
+                 chat_mdl: LLMBundle,
+                 prompt_config: dict,
+                 kb_retrieve: partial = None,
+                 kg_retrieve: partial = None
+                 ):
+        self.chat_mdl = chat_mdl
+        self.prompt_config = prompt_config
+        self._kb_retrieve = kb_retrieve
+        self._kg_retrieve = kg_retrieve
+
+    def thinking(self, chunk_info: dict, question: str):
+        def rm_query_tags(line):
+            pattern = re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY)
+            return re.sub(pattern, "", line)
+
+        def rm_result_tags(line):
+            pattern = re.escape(BEGIN_SEARCH_RESULT) + r"(.*?)" + re.escape(END_SEARCH_RESULT)
+            return re.sub(pattern, "", line)
+
+        executed_search_queries = []
+        msg_hisotry = [{"role": "user", "content": f'Question:\"{question}\"\n'}]
+        all_reasoning_steps = []
+        think = "<think>"
+        for ii in range(MAX_SEARCH_LIMIT + 1):
+            if ii == MAX_SEARCH_LIMIT - 1:
+                summary_think = f"\n{BEGIN_SEARCH_RESULT}\nThe maximum search limit is exceeded. You are not allowed to search.\n{END_SEARCH_RESULT}\n"
+                yield {"answer": think + summary_think + "</think>", "reference": {}, "audio_binary": None}
+                all_reasoning_steps.append(summary_think)
+                msg_hisotry.append({"role": "assistant", "content": summary_think})
+                break
+
+            query_think = ""
+            if msg_hisotry[-1]["role"] != "user":
+                msg_hisotry.append({"role": "user", "content": "Continues reasoning with the new information.\n"})
+            else:
+                msg_hisotry[-1]["content"] += "\n\nContinues reasoning with the new information.\n"
+            for ans in self.chat_mdl.chat_streamly(REASON_PROMPT, msg_hisotry, {"temperature": 0.7}):
+                ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
+                if not ans:
+                    continue
+                query_think = ans
+                yield {"answer": think + rm_query_tags(query_think) + "</think>", "reference": {}, "audio_binary": None}
+
+            think += rm_query_tags(query_think)
+            all_reasoning_steps.append(query_think)
+            queries = extract_between(query_think, BEGIN_SEARCH_QUERY, END_SEARCH_QUERY)
+            if not queries:
+                if ii > 0:
+                    break
+                queries = [question]
+
+            for search_query in queries:
+                logging.info(f"[THINK]Query: {ii}. {search_query}")
+                msg_hisotry.append({"role": "assistant", "content": search_query})
+                think += f"\n\n> {ii + 1}. {search_query}\n\n"
+                yield {"answer": think + "</think>", "reference": {}, "audio_binary": None}
+
+                summary_think = ""
+                # The search query has been searched in previous steps.
+                if search_query in executed_search_queries:
+                    summary_think = f"\n{BEGIN_SEARCH_RESULT}\nYou have searched this query. Please refer to previous results.\n{END_SEARCH_RESULT}\n"
+                    yield {"answer": think + summary_think + "</think>", "reference": {}, "audio_binary": None}
+                    all_reasoning_steps.append(summary_think)
+                    msg_hisotry.append({"role": "user", "content": summary_think})
+                    think += summary_think
+                    continue
+
+                truncated_prev_reasoning = ""
+                for i, step in enumerate(all_reasoning_steps):
+                    truncated_prev_reasoning += f"Step {i + 1}: {step}\n\n"
+
+                prev_steps = truncated_prev_reasoning.split('\n\n')
+                if len(prev_steps) <= 5:
+                    truncated_prev_reasoning = '\n\n'.join(prev_steps)
+                else:
+                    truncated_prev_reasoning = ''
+                    for i, step in enumerate(prev_steps):
+                        if i == 0 or i >= len(prev_steps) - 4 or BEGIN_SEARCH_QUERY in step or BEGIN_SEARCH_RESULT in step:
+                            truncated_prev_reasoning += step + '\n\n'
+                        else:
+                            if truncated_prev_reasoning[-len('\n\n...\n\n'):] != '\n\n...\n\n':
+                                truncated_prev_reasoning += '...\n\n'
+                truncated_prev_reasoning = truncated_prev_reasoning.strip('\n')
+
+                # Retrieval procedure:
+                # 1. KB search
+                # 2. Web search (optional)
+                # 3. KG search (optional)
+                kbinfos = self._kb_retrieve(question=search_query) if self._kb_retrieve else {"chunks": [], "doc_aggs": []}
+
+                if self.prompt_config.get("tavily_api_key"):
+                    tav = Tavily(self.prompt_config["tavily_api_key"])
+                    tav_res = tav.retrieve_chunks(search_query)
+                    kbinfos["chunks"].extend(tav_res["chunks"])
+                    kbinfos["doc_aggs"].extend(tav_res["doc_aggs"])
+                if self.prompt_config.get("use_kg") and self._kg_retrieve:
+                    ck = self._kg_retrieve(question=search_query)
+                    if ck["content_with_weight"]:
+                        kbinfos["chunks"].insert(0, ck)
+
+                # Merge chunk info for citations
+                if not chunk_info["chunks"]:
+                    for k in chunk_info.keys():
+                        chunk_info[k] = kbinfos[k]
+                else:
+                    cids = [c["chunk_id"] for c in chunk_info["chunks"]]
+                    for c in kbinfos["chunks"]:
+                        if c["chunk_id"] in cids:
+                            continue
+                        chunk_info["chunks"].append(c)
+                    dids = [d["doc_id"] for d in chunk_info["doc_aggs"]]
+                    for d in kbinfos["doc_aggs"]:
+                        if d["doc_id"] in dids:
+                            continue
+                        chunk_info["doc_aggs"].append(d)
+
+                think += "\n\n"
+                for ans in self.chat_mdl.chat_streamly(
+                        RELEVANT_EXTRACTION_PROMPT.format(
+                            prev_reasoning=truncated_prev_reasoning,
+                            search_query=search_query,
+                            document="\n".join(kb_prompt(kbinfos, 4096))
+                        ),
+                        [{"role": "user",
+                          "content": f'Now you should analyze each web page and find helpful information based on the current search query "{search_query}" and previous reasoning steps.'}],
+                        {"temperature": 0.7}):
+                    ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
+                    if not ans:
+                        continue
+                    summary_think = ans
+                    yield {"answer": think + rm_result_tags(summary_think) + "</think>", "reference": {}, "audio_binary": None}
+
+                all_reasoning_steps.append(summary_think)
+                msg_hisotry.append(
+                    {"role": "user", "content": f"\n\n{BEGIN_SEARCH_RESULT}{summary_think}{END_SEARCH_RESULT}\n\n"})
+                think += rm_result_tags(summary_think)
+                logging.info(f"[THINK]Summary: {ii}. {summary_think}")
+
+        yield think + "</think>"
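`thinking` relies on marker tags to pull search queries out of the model's output and to hide them from the displayed reasoning. A minimal sketch of that tag handling follows, assuming the marker values defined in `agentic_reasoning/prompts.py`. `extract_queries` and `strip_query_tags` are hypothetical stand-ins, not the actual `rag.nlp.extract_between` or `rm_query_tags` implementations.

```python
import re

# Marker values copied from agentic_reasoning/prompts.py.
BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
END_SEARCH_QUERY = "<|end_search_query|>"

# Non-greedy match between the escaped markers. DOTALL is an assumption made
# here so that a query spanning several lines is still captured.
_QUERY_PATTERN = re.compile(
    re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY),
    flags=re.DOTALL,
)


def extract_queries(text: str) -> list[str]:
    """Return every search query found between the marker tags."""
    return [q.strip() for q in _QUERY_PATTERN.findall(text)]


def strip_query_tags(text: str) -> str:
    """Remove the tagged queries so only the visible reasoning remains."""
    return _QUERY_PATTERN.sub("", text)


step = (
    "I need the director first. "
    f"{BEGIN_SEARCH_QUERY}Who is the director of Jaws?{END_SEARCH_QUERY}"
)
print(extract_queries(step))
print(strip_query_tags(step))
```

Escaping the markers with `re.escape` matters because they contain `|`, which would otherwise be read as regex alternation.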
--- a/agentic_reasoning/prompts.py
+++ b/agentic_reasoning/prompts.py
@ -0,0 +1,112 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
+END_SEARCH_QUERY = "<|end_search_query|>"
+BEGIN_SEARCH_RESULT = "<|begin_search_result|>"
+END_SEARCH_RESULT = "<|end_search_result|>"
+MAX_SEARCH_LIMIT = 6
+
+REASON_PROMPT = (
+        "You are a reasoning assistant with the ability to perform dataset searches to help "
+        "you answer the user's question accurately. You have special tools:\n\n"
+        f"- To perform a search: write {BEGIN_SEARCH_QUERY} your query here {END_SEARCH_QUERY}.\n"
+        f"Then, the system will search and analyze relevant content, then provide you with helpful information in the format {BEGIN_SEARCH_RESULT} ...search results... {END_SEARCH_RESULT}.\n\n"
+        f"You can repeat the search process multiple times if necessary. The maximum number of search attempts is limited to {MAX_SEARCH_LIMIT}.\n\n"
+        "Once you have all the information you need, continue your reasoning.\n\n"
+        "-- Example 1 --\n" ########################################
+        "Question: \"Are both the directors of Jaws and Casino Royale from the same country?\"\n"
+        "Assistant:\n"
+        f"    {BEGIN_SEARCH_QUERY}Who is the director of Jaws?{END_SEARCH_QUERY}\n\n"
+        "User:\n"
+        f"    {BEGIN_SEARCH_RESULT}\nThe director of Jaws is Steven Spielberg...\n{END_SEARCH_RESULT}\n\n"
+        "Continues reasoning with the new information.\n"
+        "Assistant:\n"
+        f"    {BEGIN_SEARCH_QUERY}Where is Steven Spielberg from?{END_SEARCH_QUERY}\n\n"
+        "User:\n"
+        f"    {BEGIN_SEARCH_RESULT}\nSteven Allan Spielberg is an American filmmaker...\n{END_SEARCH_RESULT}\n\n"
+        "Continues reasoning with the new information...\n\n"
+        "Assistant:\n"
+        f"    {BEGIN_SEARCH_QUERY}Who is the director of Casino Royale?{END_SEARCH_QUERY}\n\n"
+        "User:\n"
+        f"    {BEGIN_SEARCH_RESULT}\nCasino Royale is a 2006 spy film directed by Martin Campbell...\n{END_SEARCH_RESULT}\n\n"
+        "Continues reasoning with the new information...\n\n"
+        "Assistant:\n"
+        f"    {BEGIN_SEARCH_QUERY}Where is Martin Campbell from?{END_SEARCH_QUERY}\n\n"
+        "User:\n"
+        f"    {BEGIN_SEARCH_RESULT}\nMartin Campbell (born 24 October 1943) is a New Zealand film and television director...\n{END_SEARCH_RESULT}\n\n"
+        "Continues reasoning with the new information...\n\n"
+        "Assistant:\nIt's enough to answer the question\n"
+
+        "-- Example 2 --\n" #########################################
+        "Question: \"When was the founder of craigslist born?\"\n"
+        "Assistant:\n"
+        f"    {BEGIN_SEARCH_QUERY}Who was the founder of craigslist?{END_SEARCH_QUERY}\n\n"
+        "User:\n"
+        f"    {BEGIN_SEARCH_RESULT}\nCraigslist was founded by Craig Newmark...\n{END_SEARCH_RESULT}\n\n"
+        "Continues reasoning with the new information.\n"
+        "Assistant:\n"
+        f"    {BEGIN_SEARCH_QUERY}When was Craig Newmark born?{END_SEARCH_QUERY}\n\n"
+        "User:\n"
+        f"    {BEGIN_SEARCH_RESULT}\nCraig Newmark was born on December 6, 1952...\n{END_SEARCH_RESULT}\n\n"
+        "Continues reasoning with the new information...\n\n"
+        "Assistant:\nIt's enough to answer the question\n"
+        "**Remember**:\n"
+        "- You have a dataset to search, so you just provide a proper search query.\n"
+        f"- Use {BEGIN_SEARCH_QUERY} to request a dataset search and end with {END_SEARCH_QUERY}.\n"
+        "- The language of the query MUST be the same as that of the 'Question' or 'search result'.\n"
+        "- When done searching, continue your reasoning.\n\n"
+        'Please answer the following question. You should think step by step to solve it.\n\n'
+    )
+
+RELEVANT_EXTRACTION_PROMPT = """**Task Instruction:**
+
+    You are tasked with reading and analyzing web pages based on the following inputs: **Previous Reasoning Steps**, **Current Search Query**, and **Searched Web Pages**. Your objective is to extract relevant and helpful information for **Current Search Query** from the **Searched Web Pages** and seamlessly integrate this information into the **Previous Reasoning Steps** to continue reasoning for the original question.
+
+    **Guidelines:**
+
+    1. **Analyze the Searched Web Pages:**
+    - Carefully review the content of each searched web page.
+    - Identify factual information that is relevant to the **Current Search Query** and can aid in the reasoning process for the original question.
+
+    2. **Extract Relevant Information:**
+    - Select the information from the Searched Web Pages that directly contributes to advancing the **Previous Reasoning Steps**.
+    - Ensure that the extracted information is accurate and relevant.
+
+    3. **Output Format:**
+    - **If the web pages provide helpful information for current search query:** Present the information beginning with `**Final Information**` as shown below.
+    - The language of the query **MUST BE** the same as that of the 'Search Query' or 'Web Pages'.
+    **Final Information**
+
+    [Helpful information]
+
+    - **If the web pages do not provide any helpful information for current search query:** Output the following text.
+
+    **Final Information**
+
+    No helpful information found.
+
+    **Inputs:**
+    - **Previous Reasoning Steps:**  
+    {prev_reasoning}
+
+    - **Current Search Query:**  
+    {search_query}
+
+    - **Searched Web Pages:**  
+    {document}
+
+    """
--- a/api/__init__.py
+++ b/api/__init__.py
@ -0,0 +1,18 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from beartype.claw import beartype_this_package
+beartype_this_package()
--- a/api/apps/__init__.py
+++ b/api/apps/__init__.py
@ -0,0 +1,165 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import os
+import sys
+import logging
+from importlib.util import module_from_spec, spec_from_file_location
+from pathlib import Path
+from flask import Blueprint, Flask
+from werkzeug.wrappers.request import Request
+from flask_cors import CORS
+from flasgger import Swagger
+from itsdangerous.url_safe import URLSafeTimedSerializer as Serializer
+
+from api.db import StatusEnum
+from api.db.db_models import close_connection
+from api.db.services import UserService
+from api.utils import CustomJSONEncoder, commands
+
+from flask_session import Session
+from flask_login import LoginManager
+from api import settings
+from api.utils.api_utils import server_error_response
+from api.constants import API_VERSION
+
+__all__ = ["app"]
+
+Request.json = property(lambda self: self.get_json(force=True, silent=True))
+
+app = Flask(__name__)
+
+# Swagger UI configuration
+swagger_config = {
+    "headers": [],
+    "specs": [
+        {
+            "endpoint": "apispec",
+            "route": "/apispec.json",
+            "rule_filter": lambda rule: True,  # Include all endpoints
+            "model_filter": lambda tag: True,  # Include all models
+        }
+    ],
+    "static_url_path": "/flasgger_static",
+    "swagger_ui": True,
+    "specs_route": "/apidocs/",
+}
+
+swagger = Swagger(
+    app,
+    config=swagger_config,
+    template={
+        "swagger": "2.0",
+        "info": {
+            "title": "RAGFlow API",
+            "description": "",
+            "version": "1.0.0",
+        },
+        "securityDefinitions": {
+            "ApiKeyAuth": {"type": "apiKey", "name": "Authorization", "in": "header"}
+        },
+    },
+)
+
+CORS(app, supports_credentials=True, max_age=2592000)
+app.url_map.strict_slashes = False
+app.json_encoder = CustomJSONEncoder
+app.errorhandler(Exception)(server_error_response)
+
+# Convenience for dev and debug
+# app.config["LOGIN_DISABLED"] = True
+app.config["SESSION_PERMANENT"] = False
+app.config["SESSION_TYPE"] = "filesystem"
+app.config["MAX_CONTENT_LENGTH"] = int(
+    os.environ.get("MAX_CONTENT_LENGTH", 128 * 1024 * 1024)
+)
+
+Session(app)
+login_manager = LoginManager()
+login_manager.init_app(app)
+
+commands.register_commands(app)
+
+
+def search_pages_path(pages_dir):
+    app_path_list = [
+        path for path in pages_dir.glob("*_app.py") if not path.name.startswith(".")
+    ]
+    api_path_list = [
+        path for path in pages_dir.glob("*sdk/*.py") if not path.name.startswith(".")
+    ]
+    app_path_list.extend(api_path_list)
+    return app_path_list
+
+
+def register_page(page_path):
+    path = f"{page_path}"
+
+    page_name = page_path.stem.removesuffix("_app")  # drop the suffix; rstrip() would strip a character set
+    module_name = ".".join(
+        page_path.parts[page_path.parts.index("api"): -1] + (page_name,)
+    )
+
+    spec = spec_from_file_location(module_name, page_path)
+    page = module_from_spec(spec)
+    page.app = app
+    page.manager = Blueprint(page_name, module_name)
+    sys.modules[module_name] = page
+    spec.loader.exec_module(page)
+    page_name = getattr(page, "page_name", page_name)
+    sdk_path = "\\sdk\\" if sys.platform.startswith("win") else "/sdk/"
+    url_prefix = (
+        f"/api/{API_VERSION}" if sdk_path in path else f"/{API_VERSION}/{page_name}"
+    )
+
+    app.register_blueprint(page.manager, url_prefix=url_prefix)
+    return url_prefix
+
+
+pages_dir = [
+    Path(__file__).parent,
+    Path(__file__).parent.parent / "api" / "apps",
+    Path(__file__).parent.parent / "api" / "apps" / "sdk",
+]
+
+client_urls_prefix = [
+    register_page(path) for dir in pages_dir for path in search_pages_path(dir)
+]
+
+
+@login_manager.request_loader
+def load_user(web_request):
+    jwt = Serializer(secret_key=settings.SECRET_KEY)
+    authorization = web_request.headers.get("Authorization")
+    if authorization:
+        try:
+            access_token = str(jwt.loads(authorization))
+            user = UserService.query(
+                access_token=access_token, status=StatusEnum.VALID.value
+            )
+            if user:
+                return user[0]
+            else:
+                return None
+        except Exception as e:
+            logging.warning(f"load_user got exception {e}")
+            return None
+    else:
+        return None
+
+
+@app.teardown_request
+def _db_close(exc):
+    close_connection()
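The page registration above derives each blueprint's URL prefix from the module's file path: modules under an `sdk` directory share one API prefix, while `*_app.py` modules get a prefix per page. A minimal sketch of that mapping as a pure function follows. `url_prefix_for` is a hypothetical helper, the `"v1"` value is an assumed stand-in for `api.constants.API_VERSION`, and the optional `page_name` override read from the loaded module is left out.

```python
from pathlib import PurePosixPath

API_VERSION = "v1"  # assumption: stand-in for api.constants.API_VERSION


def url_prefix_for(page_path: PurePosixPath) -> str:
    """Map a page module path to the URL prefix its blueprint would get."""
    # "api_app.py" -> "api"; names without the suffix are left unchanged.
    page_name = page_path.stem.removesuffix("_app")
    # Modules inside an "sdk" directory are all served under /api/<version>.
    if "sdk" in page_path.parent.parts:
        return f"/api/{API_VERSION}"
    return f"/{API_VERSION}/{page_name}"


print(url_prefix_for(PurePosixPath("api/apps/api_app.py")))
print(url_prefix_for(PurePosixPath("api/apps/sdk/chat.py")))
```

Keeping the mapping free of Flask objects makes it easy to unit test without importing the application.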
--- a/api/apps/api_app.py
+++ b/api/apps/api_app.py
@ -0,0 +1,854 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import os
+import re
+from datetime import datetime, timedelta
+from flask import request, Response
+from api.db.services.llm_service import TenantLLMService
+from flask_login import login_required, current_user
+
+from api.db import FileType, LLMType, ParserType, FileSource
+from api.db.db_models import APIToken, Task, File
+from api.db.services import duplicate_name
+from api.db.services.api_service import APITokenService, API4ConversationService
+from api.db.services.dialog_service import DialogService, chat
+from api.db.services.document_service import DocumentService, doc_upload_and_parse
+from api.db.services.file2document_service import File2DocumentService
+from api.db.services.file_service import FileService
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.task_service import queue_tasks, TaskService
+from api.db.services.user_service import UserTenantService
+from api import settings
+from api.utils import get_uuid, current_timestamp, datetime_format
+from api.utils.api_utils import server_error_response, get_data_error_result, get_json_result, validate_request, \
+    generate_confirmation_token
+
+from api.utils.file_utils import filename_type, thumbnail
+from rag.app.tag import label_question
+from rag.prompts import keyword_extraction
+from rag.utils.storage_factory import STORAGE_IMPL
+
+from api.db.services.canvas_service import UserCanvasService
+from agent.canvas import Canvas
+from functools import partial
+
+
+@manager.route('/new_token', methods=['POST'])  # noqa: F821
+@login_required
+def new_token():
+    req = request.json
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+
+        tenant_id = tenants[0].tenant_id
+        obj = {"tenant_id": tenant_id, "token": generate_confirmation_token(tenant_id),
+               "create_time": current_timestamp(),
+               "create_date": datetime_format(datetime.now()),
+               "update_time": None,
+               "update_date": None
+               }
+        if req.get("canvas_id"):
+            obj["dialog_id"] = req["canvas_id"]
+            obj["source"] = "agent"
+        else:
+            obj["dialog_id"] = req["dialog_id"]
+
+        if not APITokenService.save(**obj):
+            return get_data_error_result(message="Failed to create a new token!")
+
+        return get_json_result(data=obj)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/token_list', methods=['GET'])  # noqa: F821
+@login_required
+def token_list():
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+
+        id = request.args["dialog_id"] if "dialog_id" in request.args else request.args["canvas_id"]
+        objs = APITokenService.query(tenant_id=tenants[0].tenant_id, dialog_id=id)
+        return get_json_result(data=[o.to_dict() for o in objs])
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@validate_request("tokens", "tenant_id")
+@login_required
+def rm():
+    req = request.json
+    try:
+        for token in req["tokens"]:
+            APITokenService.filter_delete(
+                [APIToken.tenant_id == req["tenant_id"], APIToken.token == token])
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/stats', methods=['GET'])  # noqa: F821
+@login_required
+def stats():
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+        objs = API4ConversationService.stats(
+            tenants[0].tenant_id,
+            request.args.get(
+                "from_date",
+                (datetime.now() -
+                 timedelta(
+                     days=7)).strftime("%Y-%m-%d 00:00:00")),
+            request.args.get(
+                "to_date",
+                datetime.now().strftime("%Y-%m-%d %H:%M:%S")),
+            "agent" if "canvas_id" in request.args else None)
+        res = {
+            "pv": [(o["dt"], o["pv"]) for o in objs],
+            "uv": [(o["dt"], o["uv"]) for o in objs],
+            "speed": [(o["dt"], float(o["tokens"]) / (float(o["duration"] + 0.1))) for o in objs],
+            "tokens": [(o["dt"], float(o["tokens"]) / 1000.) for o in objs],
+            "round": [(o["dt"], o["round"]) for o in objs],
+            "thumb_up": [(o["dt"], o["thumb_up"]) for o in objs]
+        }
+        return get_json_result(data=res)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/new_conversation', methods=['GET'])  # noqa: F821
+def set_conversation():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+    try:
+        if objs[0].source == "agent":
+            e, cvs = UserCanvasService.get_by_id(objs[0].dialog_id)
+            if not e:
+                return server_error_response("canvas not found.")
+            if not isinstance(cvs.dsl, str):
+                cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
+            canvas = Canvas(cvs.dsl, objs[0].tenant_id)
+            conv = {
+                "id": get_uuid(),
+                "dialog_id": cvs.id,
+                "user_id": request.args.get("user_id", ""),
+                "message": [{"role": "assistant", "content": canvas.get_prologue()}],
+                "source": "agent"
+            }
+            API4ConversationService.save(**conv)
+            return get_json_result(data=conv)
+        else:
+            e, dia = DialogService.get_by_id(objs[0].dialog_id)
+            if not e:
+                return get_data_error_result(message="Dialog not found")
+            conv = {
+                "id": get_uuid(),
+                "dialog_id": dia.id,
+                "user_id": request.args.get("user_id", ""),
+                "message": [{"role": "assistant", "content": dia.prompt_config["prologue"]}]
+            }
+            API4ConversationService.save(**conv)
+            return get_json_result(data=conv)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/completion', methods=['POST'])  # noqa: F821
+@validate_request("conversation_id", "messages")
+def completion():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+    req = request.json
+    e, conv = API4ConversationService.get_by_id(req["conversation_id"])
+    if not e:
+        return get_data_error_result(message="Conversation not found!")
+    if "quote" not in req:
+        req["quote"] = False
+
+    msg = []
+    for m in req["messages"]:
+        if m["role"] == "system":
+            continue
+        if m["role"] == "assistant" and not msg:
+            continue
+        msg.append(m)
+    if not msg[-1].get("id"):
+        msg[-1]["id"] = get_uuid()
+    message_id = msg[-1]["id"]
+
+    def fillin_conv(ans):
+        nonlocal conv, message_id
+        if not conv.reference:
+            conv.reference.append(ans["reference"])
+        else:
+            conv.reference[-1] = ans["reference"]
+        conv.message[-1] = {"role": "assistant", "content": ans["answer"], "id": message_id}
+        ans["id"] = message_id
+
+    def rename_field(ans):
+        reference = ans['reference']
+        if not isinstance(reference, dict):
+            return
+        for chunk_i in reference.get('chunks', []):
+            if 'docnm_kwd' in chunk_i:
+                chunk_i['doc_name'] = chunk_i['docnm_kwd']
+                chunk_i.pop('docnm_kwd')
+
+    try:
+        if conv.source == "agent":
+            stream = req.get("stream", True)
+            conv.message.append(msg[-1])
+            e, cvs = UserCanvasService.get_by_id(conv.dialog_id)
+            if not e:
+                return get_data_error_result(message="canvas not found.")
+            del req["conversation_id"]
+            del req["messages"]
+
+            if not isinstance(cvs.dsl, str):
+                cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
+
+            if not conv.reference:
+                conv.reference = []
+            conv.message.append({"role": "assistant", "content": "", "id": message_id})
+            conv.reference.append({"chunks": [], "doc_aggs": []})
+
+            final_ans = {"reference": [], "content": ""}
+            canvas = Canvas(cvs.dsl, objs[0].tenant_id)
+
+            canvas.messages.append(msg[-1])
+            canvas.add_user_input(msg[-1]["content"])
+            answer = canvas.run(stream=stream)
+
+            assert answer is not None, "Nothing. Is it over?"
+
+            if stream:
+                assert isinstance(answer, partial), "Nothing. Is it over?"
+
+                def sse():
+                    nonlocal answer, cvs, conv
+                    try:
+                        for ans in answer():
+                            for k in ans.keys():
+                                final_ans[k] = ans[k]
+                            ans = {"answer": ans["content"], "reference": ans.get("reference", [])}
+                            fillin_conv(ans)
+                            rename_field(ans)
+                            yield "data:" + json.dumps({"code": 0, "message": "", "data": ans},
+                                                       ensure_ascii=False) + "\n\n"
+
+                        canvas.messages.append({"role": "assistant", "content": final_ans["content"], "id": message_id})
+                        canvas.history.append(("assistant", final_ans["content"]))
+                        if final_ans.get("reference"):
+                            canvas.reference.append(final_ans["reference"])
+                        cvs.dsl = json.loads(str(canvas))
+                        API4ConversationService.append_message(conv.id, conv.to_dict())
+                    except Exception as e:
+                        yield "data:" + json.dumps({"code": 500, "message": str(e),
+                                                    "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
+                                                   ensure_ascii=False) + "\n\n"
+                    yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
+
+                resp = Response(sse(), mimetype="text/event-stream")
+                resp.headers.add_header("Cache-control", "no-cache")
+                resp.headers.add_header("Connection", "keep-alive")
+                resp.headers.add_header("X-Accel-Buffering", "no")
+                resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+                return resp
+
+            final_ans["content"] = "\n".join(answer["content"]) if "content" in answer else ""
+            canvas.messages.append({"role": "assistant", "content": final_ans["content"], "id": message_id})
+            if final_ans.get("reference"):
+                canvas.reference.append(final_ans["reference"])
+            cvs.dsl = json.loads(str(canvas))
+
+            result = {"answer": final_ans["content"], "reference": final_ans.get("reference", [])}
+            fillin_conv(result)
+            API4ConversationService.append_message(conv.id, conv.to_dict())
+            rename_field(result)
+            return get_json_result(data=result)
+
+        # ******************For dialog******************
+        conv.message.append(msg[-1])
+        e, dia = DialogService.get_by_id(conv.dialog_id)
+        if not e:
+            return get_data_error_result(message="Dialog not found!")
+        del req["conversation_id"]
+        del req["messages"]
+
+        if not conv.reference:
+            conv.reference = []
+        conv.message.append({"role": "assistant", "content": "", "id": message_id})
+        conv.reference.append({"chunks": [], "doc_aggs": []})
+
+        def stream():
+            nonlocal dia, msg, req, conv
+            try:
+                for ans in chat(dia, msg, True, **req):
+                    fillin_conv(ans)
+                    rename_field(ans)
+                    yield "data:" + json.dumps({"code": 0, "message": "", "data": ans},
+                                               ensure_ascii=False) + "\n\n"
+                API4ConversationService.append_message(conv.id, conv.to_dict())
+            except Exception as e:
+                yield "data:" + json.dumps({"code": 500, "message": str(e),
+                                            "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
+                                           ensure_ascii=False) + "\n\n"
+            yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
+
+        if req.get("stream", True):
+            resp = Response(stream(), mimetype="text/event-stream")
+            resp.headers.add_header("Cache-control", "no-cache")
+            resp.headers.add_header("Connection", "keep-alive")
+            resp.headers.add_header("X-Accel-Buffering", "no")
+            resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+            return resp
+
+        answer = None
+        for ans in chat(dia, msg, **req):
+            answer = ans
+            fillin_conv(ans)
+            API4ConversationService.append_message(conv.id, conv.to_dict())
+            break
+        rename_field(answer)
+        return get_json_result(data=answer)
+
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/conversation/<conversation_id>', methods=['GET'])  # noqa: F821
+# @login_required
+def get(conversation_id):
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    try:
+        e, conv = API4ConversationService.get_by_id(conversation_id)
+        if not e:
+            return get_data_error_result(message="Conversation not found!")
+
+        conv = conv.to_dict()
+        if token != APIToken.query(dialog_id=conv['dialog_id'])[0].token:
+            return get_json_result(data=False, message='Authentication error: API key is invalid for this conversation_id!',
+                                   code=settings.RetCode.AUTHENTICATION_ERROR)
+
+        for reference_i in conv['reference']:
+            if reference_i is None or len(reference_i) == 0:
+                continue
+            for chunk_i in reference_i['chunks']:
+                if 'docnm_kwd' in chunk_i.keys():
+                    chunk_i['doc_name'] = chunk_i['docnm_kwd']
+                    chunk_i.pop('docnm_kwd')
+        return get_json_result(data=conv)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/document/upload', methods=['POST'])  # noqa: F821
+@validate_request("kb_name")
+def upload():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    kb_name = request.form.get("kb_name").strip()
+    tenant_id = objs[0].tenant_id
+
+    try:
+        e, kb = KnowledgebaseService.get_by_name(kb_name, tenant_id)
+        if not e:
+            return get_data_error_result(
+                message="Can't find this knowledgebase!")
+        kb_id = kb.id
+    except Exception as e:
+        return server_error_response(e)
+
+    if 'file' not in request.files:
+        return get_json_result(
+            data=False, message='No file part!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    file = request.files['file']
+    if file.filename == '':
+        return get_json_result(
+            data=False, message='No file selected!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    root_folder = FileService.get_root_folder(tenant_id)
+    pf_id = root_folder["id"]
+    FileService.init_knowledgebase_docs(pf_id, tenant_id)
+    kb_root_folder = FileService.get_kb_folder(tenant_id)
+    kb_folder = FileService.new_a_file_from_kb(kb.tenant_id, kb.name, kb_root_folder["id"])
+
+    try:
+        if DocumentService.get_doc_count(kb.tenant_id) >= int(os.environ.get('MAX_FILE_NUM_PER_USER', 8192)):
+            return get_data_error_result(
+                message="Exceeded the maximum number of files for a free user!")
+
+        filename = duplicate_name(
+            DocumentService.query,
+            name=file.filename,
+            kb_id=kb_id)
+        filetype = filename_type(filename)
+        if not filetype:
+            return get_data_error_result(
+                message="This file type is not supported yet!")
+
+        location = filename
+        while STORAGE_IMPL.obj_exist(kb_id, location):
+            location += "_"
+        blob = request.files['file'].read()
+        STORAGE_IMPL.put(kb_id, location, blob)
+        doc = {
+            "id": get_uuid(),
+            "kb_id": kb.id,
+            "parser_id": kb.parser_id,
+            "parser_config": kb.parser_config,
+            "created_by": kb.tenant_id,
+            "type": filetype,
+            "name": filename,
+            "location": location,
+            "size": len(blob),
+            "thumbnail": thumbnail(filename, blob)
+        }
+
+        form_data = request.form
+        if "parser_id" in form_data.keys():
+            if request.form.get("parser_id").strip() in list(vars(ParserType).values())[1:-3]:
+                doc["parser_id"] = request.form.get("parser_id").strip()
+        if doc["type"] == FileType.VISUAL:
+            doc["parser_id"] = ParserType.PICTURE.value
+        if doc["type"] == FileType.AURAL:
+            doc["parser_id"] = ParserType.AUDIO.value
+        if re.search(r"\.(ppt|pptx|pages)$", filename):
+            doc["parser_id"] = ParserType.PRESENTATION.value
+        if re.search(r"\.(eml)$", filename):
+            doc["parser_id"] = ParserType.EMAIL.value
+
+        doc_result = DocumentService.insert(doc)
+        FileService.add_file_from_kb(doc, kb_folder["id"], kb.tenant_id)
+    except Exception as e:
+        return server_error_response(e)
+
+    if "run" in form_data.keys():
+        if request.form.get("run").strip() == "1":
+            try:
+                info = {"run": 1, "progress": 0}
+                info["progress_msg"] = ""
+                info["chunk_num"] = 0
+                info["token_num"] = 0
+                DocumentService.update_by_id(doc["id"], info)
+                # if str(req["run"]) == TaskStatus.CANCEL.value:
+                tenant_id = DocumentService.get_tenant_id(doc["id"])
+                if not tenant_id:
+                    return get_data_error_result(message="Tenant not found!")
+
+                # e, doc = DocumentService.get_by_id(doc["id"])
+                TaskService.filter_delete([Task.doc_id == doc["id"]])
+                e, doc = DocumentService.get_by_id(doc["id"])
+                doc = doc.to_dict()
+                doc["tenant_id"] = tenant_id
+                bucket, name = File2DocumentService.get_storage_address(doc_id=doc["id"])
+                queue_tasks(doc, bucket, name)
+            except Exception as e:
+                return server_error_response(e)
+
+    return get_json_result(data=doc_result.to_json())
+
+
+@manager.route('/document/upload_and_parse', methods=['POST'])  # noqa: F821
+@validate_request("conversation_id")
+def upload_parse():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    if 'file' not in request.files:
+        return get_json_result(
+            data=False, message='No file part!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    file_objs = request.files.getlist('file')
+    for file_obj in file_objs:
+        if file_obj.filename == '':
+            return get_json_result(
+                data=False, message='No file selected!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    doc_ids = doc_upload_and_parse(request.form.get("conversation_id"), file_objs, objs[0].tenant_id)
+    return get_json_result(data=doc_ids)
+
+
+@manager.route('/list_chunks', methods=['POST'])  # noqa: F821
+# @login_required
+def list_chunks():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    req = request.json
+
+    try:
+        if "doc_name" in req.keys():
+            tenant_id = DocumentService.get_tenant_id_by_name(req['doc_name'])
+            doc_id = DocumentService.get_doc_id_by_doc_name(req['doc_name'])
+
+        elif "doc_id" in req.keys():
+            tenant_id = DocumentService.get_tenant_id(req['doc_id'])
+            doc_id = req['doc_id']
+        else:
+            return get_json_result(
+                data=False, message="Can't find doc_name or doc_id"
+            )
+        kb_ids = KnowledgebaseService.get_kb_ids(tenant_id)
+
+        res = settings.retrievaler.chunk_list(doc_id, tenant_id, kb_ids)
+        res = [
+            {
+                "content": res_item["content_with_weight"],
+                "doc_name": res_item["docnm_kwd"],
+                "image_id": res_item["img_id"]
+            } for res_item in res
+        ]
+
+    except Exception as e:
+        return server_error_response(e)
+
+    return get_json_result(data=res)
+
+
+@manager.route('/list_kb_docs', methods=['POST'])  # noqa: F821
+# @login_required
+def list_kb_docs():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    req = request.json
+    tenant_id = objs[0].tenant_id
+    kb_name = req.get("kb_name", "").strip()
+
+    try:
+        e, kb = KnowledgebaseService.get_by_name(kb_name, tenant_id)
+        if not e:
+            return get_data_error_result(
+                message="Can't find this knowledgebase!")
+        kb_id = kb.id
+
+    except Exception as e:
+        return server_error_response(e)
+
+    page_number = int(req.get("page", 1))
+    items_per_page = int(req.get("page_size", 15))
+    orderby = req.get("orderby", "create_time")
+    desc = req.get("desc", True)
+    keywords = req.get("keywords", "")
+
+    try:
+        docs, tol = DocumentService.get_by_kb_id(
+            kb_id, page_number, items_per_page, orderby, desc, keywords)
+        docs = [{"doc_id": doc['id'], "doc_name": doc['name']} for doc in docs]
+
+        return get_json_result(data={"total": tol, "docs": docs})
+
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/document/infos', methods=['POST'])  # noqa: F821
+@validate_request("doc_ids")
+def docinfos():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+    req = request.json
+    doc_ids = req["doc_ids"]
+    docs = DocumentService.get_by_ids(doc_ids)
+    return get_json_result(data=list(docs.dicts()))
+
+
+@manager.route('/document', methods=['DELETE'])  # noqa: F821
+# @login_required
+def document_rm():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    tenant_id = objs[0].tenant_id
+    req = request.json
+    try:
+        doc_ids = [DocumentService.get_doc_id_by_doc_name(doc_name) for doc_name in req.get("doc_names", [])]
+        for doc_id in req.get("doc_ids", []):
+            if doc_id not in doc_ids:
+                doc_ids.append(doc_id)
+
+        if not doc_ids:
+            return get_json_result(
+                data=False, message="Can't find doc_names or doc_ids"
+            )
+
+    except Exception as e:
+        return server_error_response(e)
+
+    root_folder = FileService.get_root_folder(tenant_id)
+    pf_id = root_folder["id"]
+    FileService.init_knowledgebase_docs(pf_id, tenant_id)
+
+    errors = ""
+    for doc_id in doc_ids:
+        try:
+            e, doc = DocumentService.get_by_id(doc_id)
+            if not e:
+                return get_data_error_result(message="Document not found!")
+            tenant_id = DocumentService.get_tenant_id(doc_id)
+            if not tenant_id:
+                return get_data_error_result(message="Tenant not found!")
+
+            b, n = File2DocumentService.get_storage_address(doc_id=doc_id)
+
+            if not DocumentService.remove_document(doc, tenant_id):
+                return get_data_error_result(
+                    message="Database error (Document removal)!")
+
+            f2d = File2DocumentService.get_by_document_id(doc_id)
+            FileService.filter_delete([File.source_type == FileSource.KNOWLEDGEBASE, File.id == f2d[0].file_id])
+            File2DocumentService.delete_by_document_id(doc_id)
+
+            STORAGE_IMPL.rm(b, n)
+        except Exception as e:
+            errors += str(e)
+
+    if errors:
+        return get_json_result(data=False, message=errors, code=settings.RetCode.SERVER_ERROR)
+
+    return get_json_result(data=True)
+
+
+@manager.route('/completion_aibotk', methods=['POST'])  # noqa: F821
+@validate_request("Authorization", "conversation_id", "word")
+def completion_faq():
+    import base64
+    req = request.json
+
+    token = req["Authorization"]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    e, conv = API4ConversationService.get_by_id(req["conversation_id"])
+    if not e:
+        return get_data_error_result(message="Conversation not found!")
+    if "quote" not in req:
+        req["quote"] = True
+
+    msg = []
+    msg.append({"role": "user", "content": req["word"]})
+    if not msg[-1].get("id"):
+        msg[-1]["id"] = get_uuid()
+    message_id = msg[-1]["id"]
+
+    def fillin_conv(ans):
+        nonlocal conv, message_id
+        if not conv.reference:
+            conv.reference.append(ans["reference"])
+        else:
+            conv.reference[-1] = ans["reference"]
+        conv.message[-1] = {"role": "assistant", "content": ans["answer"], "id": message_id}
+        ans["id"] = message_id
+
+    try:
+        if conv.source == "agent":
+            conv.message.append(msg[-1])
+            e, cvs = UserCanvasService.get_by_id(conv.dialog_id)
+            if not e:
+                return get_data_error_result(message="canvas not found.")
+
+            if not isinstance(cvs.dsl, str):
+                cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
+
+            if not conv.reference:
+                conv.reference = []
+            conv.message.append({"role": "assistant", "content": "", "id": message_id})
+            conv.reference.append({"chunks": [], "doc_aggs": []})
+
+            final_ans = {"reference": [], "doc_aggs": []}
+            canvas = Canvas(cvs.dsl, objs[0].tenant_id)
+
+            canvas.messages.append(msg[-1])
+            canvas.add_user_input(msg[-1]["content"])
+            answer = canvas.run(stream=False)
+
+            assert answer is not None, "Nothing. Is it over?"
+
+            data_type_picture = {
+                "type": 3,
+                "url": "base64 content"
+            }
+            data = [
+                {
+                    "type": 1,
+                    "content": ""
+                }
+            ]
+            final_ans["content"] = "\n".join(answer["content"]) if "content" in answer else ""
+            canvas.messages.append({"role": "assistant", "content": final_ans["content"], "id": message_id})
+            if final_ans.get("reference"):
+                canvas.reference.append(final_ans["reference"])
+            cvs.dsl = json.loads(str(canvas))
+
+            ans = {"answer": final_ans["content"], "reference": final_ans.get("reference", [])}
+            data[0]["content"] += re.sub(r'##\d\$\$', '', ans["answer"])
+            fillin_conv(ans)
+            API4ConversationService.append_message(conv.id, conv.to_dict())
+
+            chunk_idxs = [int(match[2]) for match in re.findall(r'##\d\$\$', ans["answer"])]
+            for chunk_idx in chunk_idxs[:1]:
+                if ans["reference"]["chunks"][chunk_idx]["img_id"]:
+                    try:
+                        bkt, nm = ans["reference"]["chunks"][chunk_idx]["img_id"].split("-")
+                        response = STORAGE_IMPL.get(bkt, nm)
+                        data_type_picture["url"] = base64.b64encode(response).decode('utf-8')
+                        data.append(data_type_picture)
+                        break
+                    except Exception as e:
+                        return server_error_response(e)
+
+            response = {"code": 200, "msg": "success", "data": data}
+            return response
+
+        # ******************For dialog******************
+        conv.message.append(msg[-1])
+        e, dia = DialogService.get_by_id(conv.dialog_id)
+        if not e:
+            return get_data_error_result(message="Dialog not found!")
+        del req["conversation_id"]
+
+        if not conv.reference:
+            conv.reference = []
+        conv.message.append({"role": "assistant", "content": "", "id": message_id})
+        conv.reference.append({"chunks": [], "doc_aggs": []})
+
+        data_type_picture = {
+            "type": 3,
+            "url": "base64 content"
+        }
+        data = [
+            {
+                "type": 1,
+                "content": ""
+            }
+        ]
+        ans = ""
+        for a in chat(dia, msg, stream=False, **req):
+            ans = a
+            break
+        data[0]["content"] += re.sub(r'##\d\$\$', '', ans["answer"])
+        fillin_conv(ans)
+        API4ConversationService.append_message(conv.id, conv.to_dict())
+
+        chunk_idxs = [int(match[2]) for match in re.findall(r'##\d\$\$', ans["answer"])]
+        for chunk_idx in chunk_idxs[:1]:
+            if ans["reference"]["chunks"][chunk_idx]["img_id"]:
+                try:
+                    bkt, nm = ans["reference"]["chunks"][chunk_idx]["img_id"].split("-")
+                    response = STORAGE_IMPL.get(bkt, nm)
+                    data_type_picture["url"] = base64.b64encode(response).decode('utf-8')
+                    data.append(data_type_picture)
+                    break
+                except Exception as e:
+                    return server_error_response(e)
+
+        response = {"code": 200, "msg": "success", "data": data}
+        return response
+
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/retrieval', methods=['POST'])  # noqa: F821
+@validate_request("kb_id", "question")
+def retrieval():
+    token = request.headers.get('Authorization').split()[1]
+    objs = APIToken.query(token=token)
+    if not objs:
+        return get_json_result(
+            data=False, message='Authentication error: API key is invalid!', code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    req = request.json
+    kb_ids = req.get("kb_id", [])
+    doc_ids = req.get("doc_ids", [])
+    question = req.get("question")
+    page = int(req.get("page", 1))
+    size = int(req.get("size", 30))
+    similarity_threshold = float(req.get("similarity_threshold", 0.2))
+    vector_similarity_weight = float(req.get("vector_similarity_weight", 0.3))
+    top = int(req.get("top_k", 1024))
+
+    try:
+        kbs = KnowledgebaseService.get_by_ids(kb_ids)
+        embd_nms = list(set([kb.embd_id for kb in kbs]))
+        if len(embd_nms) != 1:
+            return get_json_result(
+                data=False, message='Knowledge bases use different embedding models or do not exist.',
+                code=settings.RetCode.AUTHENTICATION_ERROR)
+
+        embd_mdl = TenantLLMService.model_instance(
+            kbs[0].tenant_id, LLMType.EMBEDDING.value, llm_name=kbs[0].embd_id)
+        rerank_mdl = None
+        if req.get("rerank_id"):
+            rerank_mdl = TenantLLMService.model_instance(
+                kbs[0].tenant_id, LLMType.RERANK.value, llm_name=req["rerank_id"])
+        if req.get("keyword", False):
+            chat_mdl = TenantLLMService.model_instance(kbs[0].tenant_id, LLMType.CHAT)
+            question += keyword_extraction(chat_mdl, question)
+        ranks = settings.retrievaler.retrieval(question, embd_mdl, kbs[0].tenant_id, kb_ids, page, size,
+                                               similarity_threshold, vector_similarity_weight, top,
+                                               doc_ids, rerank_mdl=rerank_mdl,
+                                               rank_feature=label_question(question, kbs))
+        for c in ranks["chunks"]:
+            c.pop("vector", None)
+        return get_json_result(data=ranks)
+    except Exception as e:
+        if "not_found" in str(e):
+            return get_json_result(data=False, message='No chunk found! Check the chunk status please!',
+                                   code=settings.RetCode.DATA_ERROR)
+        return server_error_response(e)
--- a/api/apps/canvas_app.py
+++ b/api/apps/canvas_app.py
@ -0,0 +1,286 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import traceback
+from flask import request, Response
+from flask_login import login_required, current_user
+from api.db.services.canvas_service import CanvasTemplateService, UserCanvasService
+from api.settings import RetCode
+from api.utils import get_uuid
+from api.utils.api_utils import get_json_result, server_error_response, validate_request, get_data_error_result
+from agent.canvas import Canvas
+from peewee import MySQLDatabase, PostgresqlDatabase
+from api.db.db_models import APIToken
+
+
+@manager.route('/templates', methods=['GET'])  # noqa: F821
+@login_required
+def templates():
+    return get_json_result(data=[c.to_dict() for c in CanvasTemplateService.get_all()])
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def canvas_list():
+    return get_json_result(data=sorted([c.to_dict() for c in \
+                                 UserCanvasService.query(user_id=current_user.id)], key=lambda x: x["update_time"]*-1)
+                           )
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@validate_request("canvas_ids")
+@login_required
+def rm():
+    for i in request.json["canvas_ids"]:
+        if not UserCanvasService.query(user_id=current_user.id, id=i):
+            return get_json_result(
+                data=False, message='Only the owner of the canvas is authorized for this operation.',
+                code=RetCode.OPERATING_ERROR)
+        UserCanvasService.delete_by_id(i)
+    return get_json_result(data=True)
+
+
+@manager.route('/set', methods=['POST'])  # noqa: F821
+@validate_request("dsl", "title")
+@login_required
+def save():
+    req = request.json
+    req["user_id"] = current_user.id
+    if not isinstance(req["dsl"], str):
+        req["dsl"] = json.dumps(req["dsl"], ensure_ascii=False)
+
+    req["dsl"] = json.loads(req["dsl"])
+    if "id" not in req:
+        if UserCanvasService.query(user_id=current_user.id, title=req["title"].strip()):
+            return get_data_error_result(message=f"{req['title'].strip()} already exists.")
+        req["id"] = get_uuid()
+        if not UserCanvasService.save(**req):
+            return get_data_error_result(message="Failed to save canvas.")
+    else:
+        if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
+            return get_json_result(
+                data=False, message='Only owner of canvas authorized for this operation.',
+                code=RetCode.OPERATING_ERROR)
+        UserCanvasService.update_by_id(req["id"], req)
+    return get_json_result(data=req)
+
+
+@manager.route('/get/<canvas_id>', methods=['GET'])  # noqa: F821
+@login_required
+def get(canvas_id):
+    e, c = UserCanvasService.get_by_id(canvas_id)
+    if not e:
+        return get_data_error_result(message="canvas not found.")
+    return get_json_result(data=c.to_dict())
+
+@manager.route('/getsse/<canvas_id>', methods=['GET'])  # type: ignore # noqa: F821
+def getsse(canvas_id):
+    token = request.headers.get('Authorization', '').split()  # a missing header yields [] instead of raising
+    if len(token) != 2:
+        return get_data_error_result(message='Authorization is not valid!')
+    token = token[1]
+    objs = APIToken.query(beta=token)
+    if not objs:
+        return get_data_error_result(message='Authentication error: API key is invalid!')
+    e, c = UserCanvasService.get_by_id(canvas_id)
+    if not e:
+        return get_data_error_result(message="canvas not found.")
+    return get_json_result(data=c.to_dict())
+
+
+@manager.route('/completion', methods=['POST'])  # noqa: F821
+@validate_request("id")
+@login_required
+def run():
+    req = request.json
+    stream = req.get("stream", True)
+    e, cvs = UserCanvasService.get_by_id(req["id"])
+    if not e:
+        return get_data_error_result(message="canvas not found.")
+    if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
+        return get_json_result(
+            data=False, message='Only owner of canvas authorized for this operation.',
+            code=RetCode.OPERATING_ERROR)
+
+    if not isinstance(cvs.dsl, str):
+        cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
+
+    final_ans = {"reference": [], "content": ""}
+    message_id = req.get("message_id", get_uuid())
+    try:
+        canvas = Canvas(cvs.dsl, current_user.id)
+        if "message" in req:
+            canvas.messages.append({"role": "user", "content": req["message"], "id": message_id})
+            canvas.add_user_input(req["message"])
+    except Exception as e:
+        return server_error_response(e)
+
+    if stream:
+        def sse():
+            nonlocal answer, cvs
+            try:
+                for ans in canvas.run(stream=True):
+                    if ans.get("running_status"):
+                        yield "data:" + json.dumps({"code": 0, "message": "",
+                                                    "data": {"answer": ans["content"],
+                                                             "running_status": True}},
+                                                   ensure_ascii=False) + "\n\n"
+                        continue
+                    for k in ans.keys():
+                        final_ans[k] = ans[k]
+                    ans = {"answer": ans["content"], "reference": ans.get("reference", [])}
+                    yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
+
+                canvas.messages.append({"role": "assistant", "content": final_ans["content"], "id": message_id})
+                canvas.history.append(("assistant", final_ans["content"]))
+                if not canvas.path[-1]:
+                    canvas.path.pop(-1)
+                if final_ans.get("reference"):
+                    canvas.reference.append(final_ans["reference"])
+                cvs.dsl = json.loads(str(canvas))
+                UserCanvasService.update_by_id(req["id"], cvs.to_dict())
+            except Exception as e:
+                if canvas.path and not canvas.path[-1]:
+                    canvas.path.pop(-1)
+                cvs.dsl = json.loads(str(canvas))
+                UserCanvasService.update_by_id(req["id"], cvs.to_dict())
+                traceback.print_exc()
+                yield "data:" + json.dumps({"code": 500, "message": str(e),
+                                            "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
+                                           ensure_ascii=False) + "\n\n"
+            yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
+
+        resp = Response(sse(), mimetype="text/event-stream")
+        resp.headers.add_header("Cache-control", "no-cache")
+        resp.headers.add_header("Connection", "keep-alive")
+        resp.headers.add_header("X-Accel-Buffering", "no")
+        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+        return resp
+
+    for answer in canvas.run(stream=False):
+        if answer.get("running_status"):
+            continue
+        final_ans["content"] = "\n".join(answer["content"]) if "content" in answer else ""
+        canvas.messages.append({"role": "assistant", "content": final_ans["content"], "id": message_id})
+        if final_ans.get("reference"):
+            canvas.reference.append(final_ans["reference"])
+        cvs.dsl = json.loads(str(canvas))
+        UserCanvasService.update_by_id(req["id"], cvs.to_dict())
+        return get_json_result(data={"answer": final_ans["content"], "reference": final_ans.get("reference", [])})
+
+
+@manager.route('/reset', methods=['POST'])  # noqa: F821
+@validate_request("id")
+@login_required
+def reset():
+    req = request.json
+    try:
+        e, user_canvas = UserCanvasService.get_by_id(req["id"])
+        if not e:
+            return get_data_error_result(message="canvas not found.")
+        if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
+            return get_json_result(
+                data=False, message='Only owner of canvas authorized for this operation.',
+                code=RetCode.OPERATING_ERROR)
+
+        canvas = Canvas(json.dumps(user_canvas.dsl), current_user.id)
+        canvas.reset()
+        req["dsl"] = json.loads(str(canvas))
+        UserCanvasService.update_by_id(req["id"], {"dsl": req["dsl"]})
+        return get_json_result(data=req["dsl"])
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/input_elements', methods=['GET'])  # noqa: F821
+@login_required
+def input_elements():
+    cvs_id = request.args.get("id")
+    cpn_id = request.args.get("component_id")
+    try:
+        e, user_canvas = UserCanvasService.get_by_id(cvs_id)
+        if not e:
+            return get_data_error_result(message="canvas not found.")
+        if not UserCanvasService.query(user_id=current_user.id, id=cvs_id):
+            return get_json_result(
+                data=False, message='Only owner of canvas authorized for this operation.',
+                code=RetCode.OPERATING_ERROR)
+
+        canvas = Canvas(json.dumps(user_canvas.dsl), current_user.id)
+        return get_json_result(data=canvas.get_component_input_elements(cpn_id))
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/debug', methods=['POST'])  # noqa: F821
+@validate_request("id", "component_id", "params")
+@login_required
+def debug():
+    req = request.json
+    for p in req["params"]:
+        assert p.get("key")
+    try:
+        e, user_canvas = UserCanvasService.get_by_id(req["id"])
+        if not e:
+            return get_data_error_result(message="canvas not found.")
+        if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
+            return get_json_result(
+                data=False, message='Only owner of canvas authorized for this operation.',
+                code=RetCode.OPERATING_ERROR)
+
+        canvas = Canvas(json.dumps(user_canvas.dsl), current_user.id)
+        canvas.get_component(req["component_id"])["obj"]._param.debug_inputs = req["params"]
+        df = canvas.get_component(req["component_id"])["obj"].debug()
+        return get_json_result(data=df.to_dict(orient="records"))
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/test_db_connect', methods=['POST'])  # noqa: F821
+@validate_request("db_type", "database", "username", "host", "port", "password")
+@login_required
+def test_db_connect():
+    req = request.json
+    try:
+        if req["db_type"] in ["mysql", "mariadb"]:
+            db = MySQLDatabase(req["database"], user=req["username"], host=req["host"], port=req["port"],
+                               password=req["password"])
+        elif req["db_type"] == 'postgresql':
+            db = PostgresqlDatabase(req["database"], user=req["username"], host=req["host"], port=req["port"],
+                                    password=req["password"])
+        elif req["db_type"] == 'mssql':
+            import pyodbc
+            connection_string = (
+                f"DRIVER={{ODBC Driver 17 for SQL Server}};"
+                f"SERVER={req['host']},{req['port']};"
+                f"DATABASE={req['database']};"
+                f"UID={req['username']};"
+                f"PWD={req['password']};"
+            )
+            db = pyodbc.connect(connection_string)
+            cursor = db.cursor()
+            cursor.execute("SELECT 1")
+            cursor.close()
+        else:
+            return get_data_error_result(message="Unsupported database type.")
+        if req["db_type"] != 'mssql':
+            db.connect()
+        db.close()
+
+        return get_json_result(data="Database Connection Successful!")
+    except Exception as e:
+        return server_error_response(e)
+
--- a/api/apps/chunk_app.py
+++ b/api/apps/chunk_app.py
@ -0,0 +1,372 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import datetime
+import json
+
+from flask import request
+from flask_login import login_required, current_user
+
+from rag.app.qa import rmPrefix, beAdoc
+from rag.app.tag import label_question
+from rag.nlp import search, rag_tokenizer
+from rag.prompts import keyword_extraction
+from rag.settings import PAGERANK_FLD
+from rag.utils import rmSpace
+from api.db import LLMType, ParserType
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import LLMBundle
+from api.db.services.user_service import UserTenantService
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
+from api.db.services.document_service import DocumentService
+from api import settings
+from api.utils.api_utils import get_json_result
+import xxhash
+import re
+
+
+@manager.route('/list', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id")
+def list_chunk():
+    req = request.json
+    doc_id = req["doc_id"]
+    page = int(req.get("page", 1))
+    size = int(req.get("size", 30))
+    question = req.get("keywords", "")
+    try:
+        tenant_id = DocumentService.get_tenant_id(req["doc_id"])
+        if not tenant_id:
+            return get_data_error_result(message="Tenant not found!")
+        e, doc = DocumentService.get_by_id(doc_id)
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        kb_ids = KnowledgebaseService.get_kb_ids(tenant_id)
+        query = {
+            "doc_ids": [doc_id], "page": page, "size": size, "question": question, "sort": True
+        }
+        if "available_int" in req:
+            query["available_int"] = int(req["available_int"])
+        sres = settings.retrievaler.search(query, search.index_name(tenant_id), kb_ids, highlight=True)
+        res = {"total": sres.total, "chunks": [], "doc": doc.to_dict()}
+        for id in sres.ids:
+            d = {
+                "chunk_id": id,
+                "content_with_weight": (
+                    rmSpace(sres.highlight[id]) if question and id in sres.highlight
+                    else sres.field[id].get("content_with_weight", "")),
+                "doc_id": sres.field[id]["doc_id"],
+                "docnm_kwd": sres.field[id]["docnm_kwd"],
+                "important_kwd": sres.field[id].get("important_kwd", []),
+                "question_kwd": sres.field[id].get("question_kwd", []),
+                "image_id": sres.field[id].get("img_id", ""),
+                "available_int": int(sres.field[id].get("available_int", 1)),
+                "positions": sres.field[id].get("position_int", []),
+            }
+            assert isinstance(d["positions"], list)
+            assert len(d["positions"]) == 0 or (isinstance(d["positions"][0], list) and len(d["positions"][0]) == 5)
+            res["chunks"].append(d)
+        return get_json_result(data=res)
+    except Exception as e:
+        if "not_found" in str(e):
+            return get_json_result(data=False, message='No chunk found!',
+                                   code=settings.RetCode.DATA_ERROR)
+        return server_error_response(e)
+
+
+@manager.route('/get', methods=['GET'])  # noqa: F821
+@login_required
+def get():
+    chunk_id = request.args["chunk_id"]
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+        for tenant in tenants:
+            kb_ids = KnowledgebaseService.get_kb_ids(tenant.tenant_id)
+            chunk = settings.docStoreConn.get(chunk_id, search.index_name(tenant.tenant_id), kb_ids)
+            if chunk:
+                break
+        if chunk is None:
+            return server_error_response(Exception("Chunk not found"))
+
+        k = []
+        for n in chunk.keys():
+            if re.search(r"(_vec$|_sm_|_tks|_ltks)", n):
+                k.append(n)
+        for n in k:
+            del chunk[n]
+
+        return get_json_result(data=chunk)
+    except Exception as e:
+        if str(e).find("NotFoundError") >= 0:
+            return get_json_result(data=False, message='Chunk not found!',
+                                   code=settings.RetCode.DATA_ERROR)
+        return server_error_response(e)
+
+
+@manager.route('/set', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id", "chunk_id", "content_with_weight")
+def set():
+    req = request.json
+    d = {
+        "id": req["chunk_id"],
+        "content_with_weight": req["content_with_weight"]}
+    d["content_ltks"] = rag_tokenizer.tokenize(req["content_with_weight"])
+    d["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(d["content_ltks"])
+    if "important_kwd" in req:
+        d["important_kwd"] = req["important_kwd"]
+        d["important_tks"] = rag_tokenizer.tokenize(" ".join(req["important_kwd"]))
+    if "question_kwd" in req:
+        d["question_kwd"] = req["question_kwd"]
+        d["question_tks"] = rag_tokenizer.tokenize("\n".join(req["question_kwd"]))
+    if "tag_kwd" in req:
+        d["tag_kwd"] = req["tag_kwd"]
+    if "tag_feas" in req:
+        d["tag_feas"] = req["tag_feas"]
+    if "available_int" in req:
+        d["available_int"] = req["available_int"]
+
+    try:
+        tenant_id = DocumentService.get_tenant_id(req["doc_id"])
+        if not tenant_id:
+            return get_data_error_result(message="Tenant not found!")
+
+        embd_id = DocumentService.get_embd_id(req["doc_id"])
+        embd_mdl = LLMBundle(tenant_id, LLMType.EMBEDDING, embd_id)
+
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+
+        if doc.parser_id == ParserType.QA:
+            arr = [
+                t for t in re.split(
+                    r"[\n\t]",
+                    req["content_with_weight"]) if len(t) > 1]
+            q, a = rmPrefix(arr[0]), rmPrefix("\n".join(arr[1:]))
+            d = beAdoc(d, q, a, not any(
+                [rag_tokenizer.is_chinese(t) for t in q + a]))
+
+        v, c = embd_mdl.encode([doc.name, req["content_with_weight"] if not d.get("question_kwd") else "\n".join(d["question_kwd"])])
+        v = 0.1 * v[0] + 0.9 * v[1] if doc.parser_id != ParserType.QA else v[1]
+        d["q_%d_vec" % len(v)] = v.tolist()
+        settings.docStoreConn.update({"id": req["chunk_id"]}, d, search.index_name(tenant_id), doc.kb_id)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/switch', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("chunk_ids", "available_int", "doc_id")
+def switch():
+    req = request.json
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        for cid in req["chunk_ids"]:
+            if not settings.docStoreConn.update({"id": cid},
+                                                {"available_int": int(req["available_int"])},
+                                                search.index_name(DocumentService.get_tenant_id(req["doc_id"])),
+                                                doc.kb_id):
+                return get_data_error_result(message="Index updating failure")
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("chunk_ids", "doc_id")
+def rm():
+    req = request.json
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        if not settings.docStoreConn.delete({"id": req["chunk_ids"]}, search.index_name(DocumentService.get_tenant_id(req["doc_id"])), doc.kb_id):
+            return get_data_error_result(message="Index updating failure")
+        deleted_chunk_ids = req["chunk_ids"]
+        chunk_number = len(deleted_chunk_ids)
+        DocumentService.decrement_chunk_num(doc.id, doc.kb_id, 1, chunk_number, 0)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/create', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id", "content_with_weight")
+def create():
+    req = request.json
+    chunck_id = xxhash.xxh64((req["content_with_weight"] + req["doc_id"]).encode("utf-8")).hexdigest()
+    d = {"id": chunck_id, "content_ltks": rag_tokenizer.tokenize(req["content_with_weight"]),
+         "content_with_weight": req["content_with_weight"]}
+    d["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(d["content_ltks"])
+    d["important_kwd"] = req.get("important_kwd", [])
+    d["important_tks"] = rag_tokenizer.tokenize(" ".join(req.get("important_kwd", [])))
+    d["question_kwd"] = req.get("question_kwd", [])
+    d["question_tks"] = rag_tokenizer.tokenize("\n".join(req.get("question_kwd", [])))
+    d["create_time"] = str(datetime.datetime.now()).replace("T", " ")[:19]
+    d["create_timestamp_flt"] = datetime.datetime.now().timestamp()
+
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        d["kb_id"] = [doc.kb_id]
+        d["docnm_kwd"] = doc.name
+        d["title_tks"] = rag_tokenizer.tokenize(doc.name)
+        d["doc_id"] = doc.id
+
+        tenant_id = DocumentService.get_tenant_id(req["doc_id"])
+        if not tenant_id:
+            return get_data_error_result(message="Tenant not found!")
+
+        e, kb = KnowledgebaseService.get_by_id(doc.kb_id)
+        if not e:
+            return get_data_error_result(message="Knowledgebase not found!")
+        if kb.pagerank:
+            d[PAGERANK_FLD] = kb.pagerank
+
+        embd_id = DocumentService.get_embd_id(req["doc_id"])
+        embd_mdl = LLMBundle(tenant_id, LLMType.EMBEDDING.value, embd_id)
+
+        v, c = embd_mdl.encode([doc.name, req["content_with_weight"] if not d["question_kwd"] else "\n".join(d["question_kwd"])])
+        v = 0.1 * v[0] + 0.9 * v[1]
+        d["q_%d_vec" % len(v)] = v.tolist()
+        settings.docStoreConn.insert([d], search.index_name(tenant_id), doc.kb_id)
+
+        DocumentService.increment_chunk_num(
+            doc.id, doc.kb_id, c, 1, 0)
+        return get_json_result(data={"chunk_id": chunck_id})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/retrieval_test', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("kb_id", "question")
+def retrieval_test():
+    req = request.json
+    page = int(req.get("page", 1))
+    size = int(req.get("size", 30))
+    question = req["question"]
+    kb_ids = req["kb_id"]
+    if isinstance(kb_ids, str):
+        kb_ids = [kb_ids]
+    doc_ids = req.get("doc_ids", [])
+    similarity_threshold = float(req.get("similarity_threshold", 0.0))
+    vector_similarity_weight = float(req.get("vector_similarity_weight", 0.3))
+    use_kg = req.get("use_kg", False)
+    top = int(req.get("top_k", 1024))
+    tenant_ids = []
+
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        for kb_id in kb_ids:
+            for tenant in tenants:
+                if KnowledgebaseService.query(
+                        tenant_id=tenant.tenant_id, id=kb_id):
+                    tenant_ids.append(tenant.tenant_id)
+                    break
+            else:
+                return get_json_result(
+                    data=False, message='Only owner of knowledgebase authorized for this operation.',
+                    code=settings.RetCode.OPERATING_ERROR)
+
+        e, kb = KnowledgebaseService.get_by_id(kb_ids[0])
+        if not e:
+            return get_data_error_result(message="Knowledgebase not found!")
+
+        embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING.value, llm_name=kb.embd_id)
+
+        rerank_mdl = None
+        if req.get("rerank_id"):
+            rerank_mdl = LLMBundle(kb.tenant_id, LLMType.RERANK.value, llm_name=req["rerank_id"])
+
+        if req.get("keyword", False):
+            chat_mdl = LLMBundle(kb.tenant_id, LLMType.CHAT)
+            question += keyword_extraction(chat_mdl, question)
+
+        labels = label_question(question, [kb])
+        ranks = settings.retrievaler.retrieval(question, embd_mdl, tenant_ids, kb_ids, page, size,
+                               similarity_threshold, vector_similarity_weight, top,
+                               doc_ids, rerank_mdl=rerank_mdl, highlight=req.get("highlight"),
+                               rank_feature=labels
+                               )
+        if use_kg:
+            ck = settings.kg_retrievaler.retrieval(question,
+                                                   tenant_ids,
+                                                   kb_ids,
+                                                   embd_mdl,
+                                                   LLMBundle(kb.tenant_id, LLMType.CHAT))
+            if ck["content_with_weight"]:
+                ranks["chunks"].insert(0, ck)
+
+        for c in ranks["chunks"]:
+            c.pop("vector", None)
+        ranks["labels"] = labels
+
+        return get_json_result(data=ranks)
+    except Exception as e:
+        if "not_found" in str(e):
+            return get_json_result(data=False, message='No chunk found! Check the chunk status please!',
+                                   code=settings.RetCode.DATA_ERROR)
+        return server_error_response(e)
+
+
+@manager.route('/knowledge_graph', methods=['GET'])  # noqa: F821
+@login_required
+def knowledge_graph():
+    doc_id = request.args["doc_id"]
+    tenant_id = DocumentService.get_tenant_id(doc_id)
+    kb_ids = KnowledgebaseService.get_kb_ids(tenant_id)
+    req = {
+        "doc_ids": [doc_id],
+        "knowledge_graph_kwd": ["graph", "mind_map"]
+    }
+    sres = settings.retrievaler.search(req, search.index_name(tenant_id), kb_ids)
+    obj = {"graph": {}, "mind_map": {}}
+    for id in sres.ids[:2]:
+        ty = sres.field[id]["knowledge_graph_kwd"]
+        try:
+            content_json = json.loads(sres.field[id]["content_with_weight"])
+        except Exception:
+            continue
+
+        if ty == 'mind_map':
+            node_dict = {}
+
+            def repeat_deal(content_json, node_dict):
+                if 'id' in content_json:
+                    if content_json['id'] in node_dict:
+                        node_name = content_json['id']
+                        content_json['id'] += f"({node_dict[content_json['id']]})"
+                        node_dict[node_name] += 1
+                    else:
+                        node_dict[content_json['id']] = 1
+                if 'children' in content_json and content_json['children']:
+                    for item in content_json['children']:
+                        repeat_deal(item, node_dict)
+
+            repeat_deal(content_json, node_dict)
+
+        obj[ty] = content_json
+
+    return get_json_result(data=obj)
--- a/api/apps/conversation_app.py
+++ b/api/apps/conversation_app.py
@ -0,0 +1,430 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import re
+import traceback
+from copy import deepcopy
+from api.db.db_models import APIToken
+
+from api.db.services.conversation_service import ConversationService, structure_answer
+from api.db.services.user_service import UserTenantService
+from flask import request, Response
+from flask_login import login_required, current_user
+
+from api.db import LLMType
+from api.db.services.dialog_service import DialogService, chat, ask
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import LLMBundle, TenantService
+from api import settings
+from api.utils.api_utils import get_json_result
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
+from graphrag.general.mind_map_extractor import MindMapExtractor
+from rag.app.tag import label_question
+
+
+@manager.route('/set', methods=['POST'])  # noqa: F821
+@login_required
+def set_conversation():
+    req = request.json
+    conv_id = req.get("conversation_id")
+    is_new = req.get("is_new")
+    req.pop("is_new", None)  # tolerate requests that omit is_new
+    if not is_new:
+        req.pop("conversation_id", None)
+        try:
+            if not ConversationService.update_by_id(conv_id, req):
+                return get_data_error_result(message="Conversation not found!")
+            e, conv = ConversationService.get_by_id(conv_id)
+            if not e:
+                return get_data_error_result(
+                    message="Fail to update a conversation!")
+            conv = conv.to_dict()
+            return get_json_result(data=conv)
+        except Exception as e:
+            return server_error_response(e)
+
+    try:
+        e, dia = DialogService.get_by_id(req["dialog_id"])
+        if not e:
+            return get_data_error_result(message="Dialog not found")
+        conv = {
+            "id": conv_id,
+            "dialog_id": req["dialog_id"],
+            "name": req.get("name", "New conversation"),
+            "message": [{"role": "assistant", "content": dia.prompt_config["prologue"]}]
+        }
+        ConversationService.save(**conv)
+        return get_json_result(data=conv)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/get', methods=['GET'])  # noqa: F821
+@login_required
+def get():
+    conv_id = request.args["conversation_id"]
+    try:
+
+        e, conv = ConversationService.get_by_id(conv_id)
+        if not e:
+            return get_data_error_result(message="Conversation not found!")
+        tenants = UserTenantService.query(user_id=current_user.id)
+        avatar = None
+        for tenant in tenants:
+            dialog = DialogService.query(tenant_id=tenant.tenant_id, id=conv.dialog_id)
+            if dialog:
+                avatar = dialog[0].icon
+                break
+        else:
+            return get_json_result(
+                data=False, message='Only owner of conversation authorized for this operation.',
+                code=settings.RetCode.OPERATING_ERROR)
+
+        def get_value(d, k1, k2):
+            return d.get(k1, d.get(k2))
+
+        for ref in conv.reference:
+            if isinstance(ref, list):
+                continue
+            ref["chunks"] = [{
+                "id": get_value(ck, "chunk_id", "id"),
+                "content": get_value(ck, "content", "content_with_weight"),
+                "document_id": get_value(ck, "doc_id", "document_id"),
+                "document_name": get_value(ck, "docnm_kwd", "document_name"),
+                "dataset_id": get_value(ck, "kb_id", "dataset_id"),
+                "image_id": get_value(ck, "image_id", "img_id"),
+                "positions": get_value(ck, "positions", "position_int"),
+            } for ck in ref.get("chunks", [])]
+
+        conv = conv.to_dict()
+        conv["avatar"] = avatar
+        return get_json_result(data=conv)
+    except Exception as e:
+        return server_error_response(e)
+
+@manager.route('/getsse/<dialog_id>', methods=['GET'])  # type: ignore # noqa: F821
+def getsse(dialog_id):
+    
+    token = request.headers.get('Authorization', '').split()  # a missing header yields [] instead of raising
+    if len(token) != 2:
+        return get_data_error_result(message='Authorization is not valid!')
+    token = token[1]
+    objs = APIToken.query(beta=token)
+    if not objs:
+        return get_data_error_result(message='Authentication error: API key is invalid!')
+    try:
+        e, conv = DialogService.get_by_id(dialog_id)
+        if not e:
+            return get_data_error_result(message="Dialog not found!")
+        conv = conv.to_dict()
+        conv["avatar"] = conv["icon"]
+        del conv["icon"]
+        return get_json_result(data=conv)
+    except Exception as e:
+        return server_error_response(e)
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@login_required
+def rm():
+    conv_ids = request.json["conversation_ids"]
+    try:
+        for cid in conv_ids:
+            exist, conv = ConversationService.get_by_id(cid)
+            if not exist:
+                return get_data_error_result(message="Conversation not found!")
+            tenants = UserTenantService.query(user_id=current_user.id)
+            for tenant in tenants:
+                if DialogService.query(tenant_id=tenant.tenant_id, id=conv.dialog_id):
+                    break
+            else:
+                return get_json_result(
+                    data=False, message='Only owner of conversation authorized for this operation.',
+                    code=settings.RetCode.OPERATING_ERROR)
+            ConversationService.delete_by_id(cid)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def list_convsersation():
+    dialog_id = request.args["dialog_id"]
+    try:
+        if not DialogService.query(tenant_id=current_user.id, id=dialog_id):
+            return get_json_result(
+                data=False, message='Only owner of dialog authorized for this operation.',
+                code=settings.RetCode.OPERATING_ERROR)
+        convs = ConversationService.query(
+            dialog_id=dialog_id,
+            order_by=ConversationService.model.create_time,
+            reverse=True)
+
+        convs = [d.to_dict() for d in convs]
+        return get_json_result(data=convs)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/completion', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("conversation_id", "messages")
+def completion():
+    req = request.json
+    msg = []
+    for m in req["messages"]:
+        if m["role"] == "system":
+            continue
+        if m["role"] == "assistant" and not msg:
+            continue
+        msg.append(m)
+    message_id = msg[-1].get("id")
+    try:
+        e, conv = ConversationService.get_by_id(req["conversation_id"])
+        if not e:
+            return get_data_error_result(message="Conversation not found!")
+        conv.message = deepcopy(req["messages"])
+        e, dia = DialogService.get_by_id(conv.dialog_id)
+        if not e:
+            return get_data_error_result(message="Dialog not found!")
+        del req["conversation_id"]
+        del req["messages"]
+
+        if not conv.reference:
+            conv.reference = []
+        else:
+            def get_value(d, k1, k2):
+                return d.get(k1, d.get(k2))
+
+            for ref in conv.reference:
+                if isinstance(ref, list):
+                    continue
+                ref["chunks"] = [{
+                    "id": get_value(ck, "chunk_id", "id"),
+                    "content": get_value(ck, "content", "content_with_weight"),
+                    "document_id": get_value(ck, "doc_id", "document_id"),
+                    "document_name": get_value(ck, "docnm_kwd", "document_name"),
+                    "dataset_id": get_value(ck, "kb_id", "dataset_id"),
+                    "image_id": get_value(ck, "image_id", "img_id"),
+                    "positions": get_value(ck, "positions", "position_int"),
+                } for ck in ref.get("chunks", [])]
+
+        if not conv.reference:
+            conv.reference = []
+        conv.reference.append({"chunks": [], "doc_aggs": []})
+        def stream():
+            nonlocal dia, msg, req, conv
+            try:
+                for ans in chat(dia, msg, True, **req):
+                    ans = structure_answer(conv, ans, message_id, conv.id)
+                    yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
+                ConversationService.update_by_id(conv.id, conv.to_dict())
+            except Exception as e:
+                traceback.print_exc()
+                yield "data:" + json.dumps({"code": 500, "message": str(e),
+                                            "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
+                                           ensure_ascii=False) + "\n\n"
+            yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
+
+        if req.get("stream", True):
+            resp = Response(stream(), mimetype="text/event-stream")
+            resp.headers.add_header("Cache-control", "no-cache")
+            resp.headers.add_header("Connection", "keep-alive")
+            resp.headers.add_header("X-Accel-Buffering", "no")
+            resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+            return resp
+
+        else:
+            answer = None
+            for ans in chat(dia, msg, **req):
+                # req["conversation_id"] was deleted above, so use the loaded conversation's id.
+                answer = structure_answer(conv, ans, message_id, conv.id)
+                ConversationService.update_by_id(conv.id, conv.to_dict())
+                break
+            return get_json_result(data=answer)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/tts', methods=['POST'])  # noqa: F821
+@login_required
+def tts():
+    req = request.json
+    text = req["text"]
+
+    tenants = TenantService.get_info_by(current_user.id)
+    if not tenants:
+        return get_data_error_result(message="Tenant not found!")
+
+    tts_id = tenants[0]["tts_id"]
+    if not tts_id:
+        return get_data_error_result(message="No default TTS model is set")
+
+    tts_mdl = LLMBundle(tenants[0]["tenant_id"], LLMType.TTS, tts_id)
+
+    def stream_audio():
+        try:
+            for txt in re.split(r"[，。/《》？；：！\n\r:;]+", text):
+                for chunk in tts_mdl.tts(txt):
+                    yield chunk
+        except Exception as e:
+            yield ("data:" + json.dumps({"code": 500, "message": str(e),
+                                         "data": {"answer": "**ERROR**: " + str(e)}},
+                                        ensure_ascii=False)).encode('utf-8')
+
+    resp = Response(stream_audio(), mimetype="audio/mpeg")
+    resp.headers.add_header("Cache-Control", "no-cache")
+    resp.headers.add_header("Connection", "keep-alive")
+    resp.headers.add_header("X-Accel-Buffering", "no")
+
+    return resp
+
+
+@manager.route('/delete_msg', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("conversation_id", "message_id")
+def delete_msg():
+    req = request.json
+    e, conv = ConversationService.get_by_id(req["conversation_id"])
+    if not e:
+        return get_data_error_result(message="Conversation not found!")
+
+    conv = conv.to_dict()
+    for i, msg in enumerate(conv["message"]):
+        if req["message_id"] != msg.get("id", ""):
+            continue
+        assert conv["message"][i + 1]["id"] == req["message_id"]
+        conv["message"].pop(i)
+        conv["message"].pop(i)
+        conv["reference"].pop(max(0, i // 2 - 1))
+        break
+
+    ConversationService.update_by_id(conv["id"], conv)
+    return get_json_result(data=conv)
+
+
+@manager.route('/thumbup', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("conversation_id", "message_id")
+def thumbup():
+    req = request.json
+    e, conv = ConversationService.get_by_id(req["conversation_id"])
+    if not e:
+        return get_data_error_result(message="Conversation not found!")
+    up_down = req.get("set")
+    feedback = req.get("feedback", "")
+    conv = conv.to_dict()
+    for i, msg in enumerate(conv["message"]):
+        if req["message_id"] == msg.get("id", "") and msg.get("role", "") == "assistant":
+            if up_down:
+                msg["thumbup"] = True
+                if "feedback" in msg:
+                    del msg["feedback"]
+            else:
+                msg["thumbup"] = False
+                if feedback:
+                    msg["feedback"] = feedback
+            break
+
+    ConversationService.update_by_id(conv["id"], conv)
+    return get_json_result(data=conv)
+
+
+@manager.route('/ask', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("question", "kb_ids")
+def ask_about():
+    req = request.json
+    uid = current_user.id
+
+    def stream():
+        nonlocal req, uid
+        try:
+            for ans in ask(req["question"], req["kb_ids"], uid):
+                yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
+        except Exception as e:
+            yield "data:" + json.dumps({"code": 500, "message": str(e),
+                                        "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
+                                       ensure_ascii=False) + "\n\n"
+        yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
+
+    resp = Response(stream(), mimetype="text/event-stream")
+    resp.headers.add_header("Cache-control", "no-cache")
+    resp.headers.add_header("Connection", "keep-alive")
+    resp.headers.add_header("X-Accel-Buffering", "no")
+    resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+    return resp
+
+
+@manager.route('/mindmap', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("question", "kb_ids")
+def mindmap():
+    req = request.json
+    kb_ids = req["kb_ids"]
+    e, kb = KnowledgebaseService.get_by_id(kb_ids[0])
+    if not e:
+        return get_data_error_result(message="Knowledgebase not found!")
+
+    embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING, llm_name=kb.embd_id)
+    chat_mdl = LLMBundle(current_user.id, LLMType.CHAT)
+    question = req["question"]
+    ranks = settings.retrievaler.retrieval(question, embd_mdl, kb.tenant_id, kb_ids, 1, 12,
+                                           0.3, 0.3, aggs=False,
+                                           rank_feature=label_question(question, [kb])
+                                           )
+    mindmap = MindMapExtractor(chat_mdl)
+    mind_map = mindmap([c["content_with_weight"] for c in ranks["chunks"]]).output
+    if "error" in mind_map:
+        return server_error_response(Exception(mind_map["error"]))
+    return get_json_result(data=mind_map)
+
+
+@manager.route('/related_questions', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("question")
+def related_questions():
+    req = request.json
+    question = req["question"]
+    chat_mdl = LLMBundle(current_user.id, LLMType.CHAT)
+    prompt = """
+Objective: To generate search terms related to the user's search keywords, helping users find more valuable information.
+Instructions:
+ - Based on the keywords provided by the user, generate 5-10 related search terms.
+ - Each search term should be directly or indirectly related to the keyword, guiding the user to find more valuable information.
+ - Use common, general terms as much as possible, avoiding obscure words or technical jargon.
+ - Keep the term length between 2-4 words, concise and clear.
+ - DO NOT translate, use the language of the original keywords.
+
+### Example:
+Keywords: Chinese football
+Related search terms:
+1. Current status of Chinese football
+2. Reform of Chinese football
+3. Youth training of Chinese football
+4. Chinese football in the Asian Cup
+5. Chinese football in the World Cup
+
+Reason:
+ - When searching, users often only use one or two keywords, making it difficult to fully express their information needs.
+ - Generating related search terms can help users dig deeper into relevant information and improve search efficiency. 
+ - At the same time, related terms can also help search engines better understand user needs and return more accurate search results.
+ 
+"""
+    ans = chat_mdl.chat(prompt, [{"role": "user", "content": f"""
+Keywords: {question}
+Related search terms:
+    """}], {"temperature": 0.9})
+    # Accept multi-digit numbering ("10. ") since the prompt asks for up to 10 terms.
+    return get_json_result(data=[re.sub(r"^[0-9]+\. ", "", a) for a in ans.split("\n") if re.match(r"^[0-9]+\. ", a)])
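The streaming handlers in conversation_app.py (`completion` and `ask_about`) share one framing pattern: each answer goes out as a `data:` line carrying a JSON envelope, a failure becomes a single envelope with code 500, and a final envelope with `data: true` marks the end of the stream. Below is a minimal standalone sketch of that pattern. The helper names `sse_frames` and `parse_frame` are hypothetical and not part of RAGFlow, and the sketch leaves out the Flask `Response` wrapper and the conversation persistence step.

```python
import json
from typing import Iterable, Iterator


def sse_frames(answers: Iterable[dict]) -> Iterator[str]:
    """Yield one 'data:' frame per answer, an error frame on failure, then an end marker."""
    try:
        for ans in answers:
            yield "data:" + json.dumps(
                {"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
    except Exception as e:
        # Report the failure inside the stream, because the HTTP status line has already been sent.
        yield "data:" + json.dumps(
            {"code": 500, "message": str(e),
             "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
            ensure_ascii=False) + "\n\n"
    # End marker: tells the client that no further frames will arrive.
    yield "data:" + json.dumps(
        {"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"


def parse_frame(frame: str) -> dict:
    """Client-side inverse: strip the 'data:' prefix and decode the JSON envelope."""
    if not (frame.startswith("data:") and frame.endswith("\n\n")):
        raise ValueError("not a complete SSE data frame")
    return json.loads(frame[len("data:"):])


if __name__ == "__main__":
    for frame in sse_frames([{"answer": "hello", "reference": []}]):
        print(parse_frame(frame))
```

A client consuming such a stream can treat `data is True` as the end-of-stream signal and any envelope with a non-zero `code` as an error to show the user.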
--- a/api/apps/dialog_app.py
+++ b/api/apps/dialog_app.py
@ -0,0 +1,180 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from flask import request
+from flask_login import login_required, current_user
+from api.db.services.dialog_service import DialogService
+from api.db import StatusEnum
+from api.db.services.llm_service import TenantLLMService
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.user_service import TenantService, UserTenantService
+from api import settings
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
+from api.utils import get_uuid
+from api.utils.api_utils import get_json_result
+
+
+@manager.route('/set', methods=['POST'])  # noqa: F821
+@login_required
+def set_dialog():
+    req = request.json
+    dialog_id = req.get("dialog_id")
+    name = req.get("name", "New Dialog")
+    description = req.get("description", "A helpful dialog")
+    icon = req.get("icon", "")
+    top_n = req.get("top_n", 6)
+    top_k = req.get("top_k", 1024)
+    rerank_id = req.get("rerank_id", "")
+    if not rerank_id:
+        req["rerank_id"] = ""
+    similarity_threshold = req.get("similarity_threshold", 0.1)
+    vector_similarity_weight = req.get("vector_similarity_weight", 0.3)
+    llm_setting = req.get("llm_setting", {})
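+    default_prompt = {
+        "system": """You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question, and list the data in the knowledge base in a detailed answer. When none of the knowledge base content is relevant to the question, your answer must include the sentence "The answer you are looking for was not found in the knowledge base!". Answers need to take the chat history into account.
+Here is the knowledge base:
+{knowledge}
+The above is the knowledge base.""",
+        "prologue": "Hello, I am your assistant Xiaoying, cute and kind. Can I help you?",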
+        "parameters": [
+            {"key": "knowledge", "optional": False}
+        ],
+        "empty_response": "Sorry! No relevant content was found in the knowledge base!"
+    }
+    prompt_config = req.get("prompt_config", default_prompt)
+
+    if not prompt_config["system"]:
+        prompt_config["system"] = default_prompt["system"]
+
+    for p in prompt_config["parameters"]:
+        if p["optional"]:
+            continue
+        if prompt_config["system"].find("{%s}" % p["key"]) < 0:
+            return get_data_error_result(
+                message="Parameter '{}' is not used".format(p["key"]))
+
+    try:
+        e, tenant = TenantService.get_by_id(current_user.id)
+        if not e:
+            return get_data_error_result(message="Tenant not found!")
+        kbs = KnowledgebaseService.get_by_ids(req.get("kb_ids", []))
+        embd_ids = [TenantLLMService.split_model_name_and_factory(kb.embd_id)[0] for kb in kbs]  # remove vendor suffix for comparison
+        embd_count = len(set(embd_ids))
+        if embd_count > 1:
+            return get_data_error_result(message=f'Datasets use different embedding models: {[kb.embd_id for kb in kbs]}')
+
+        llm_id = req.get("llm_id", tenant.llm_id)
+        if not dialog_id:
+            dia = {
+                "id": get_uuid(),
+                "tenant_id": current_user.id,
+                "name": name,
+                "kb_ids": req.get("kb_ids", []),
+                "description": description,
+                "llm_id": llm_id,
+                "llm_setting": llm_setting,
+                "prompt_config": prompt_config,
+                "top_n": top_n,
+                "top_k": top_k,
+                "rerank_id": rerank_id,
+                "similarity_threshold": similarity_threshold,
+                "vector_similarity_weight": vector_similarity_weight,
+                "icon": icon
+            }
+            if not DialogService.save(**dia):
+                return get_data_error_result(message="Failed to create a dialog!")
+            return get_json_result(data=dia)
+        else:
+            del req["dialog_id"]
+            if "kb_names" in req:
+                del req["kb_names"]
+            if not DialogService.update_by_id(dialog_id, req):
+                return get_data_error_result(message="Dialog not found!")
+            e, dia = DialogService.get_by_id(dialog_id)
+            if not e:
+                return get_data_error_result(message="Failed to update the dialog!")
+            dia = dia.to_dict()
+            dia.update(req)
+            dia["kb_ids"], dia["kb_names"] = get_kb_names(dia["kb_ids"])
+            return get_json_result(data=dia)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/get', methods=['GET'])  # noqa: F821
+@login_required
+def get():
+    dialog_id = request.args["dialog_id"]
+    try:
+        e, dia = DialogService.get_by_id(dialog_id)
+        if not e:
+            return get_data_error_result(message="Dialog not found!")
+        dia = dia.to_dict()
+        dia["kb_ids"], dia["kb_names"] = get_kb_names(dia["kb_ids"])
+        return get_json_result(data=dia)
+    except Exception as e:
+        return server_error_response(e)
+
+
+def get_kb_names(kb_ids):
+    ids, nms = [], []
+    for kid in kb_ids:
+        e, kb = KnowledgebaseService.get_by_id(kid)
+        if not e or kb.status != StatusEnum.VALID.value:
+            continue
+        ids.append(kid)
+        nms.append(kb.name)
+    return ids, nms
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def list_dialogs():
+    try:
+        diags = DialogService.query(
+            tenant_id=current_user.id,
+            status=StatusEnum.VALID.value,
+            reverse=True,
+            order_by=DialogService.model.create_time)
+        diags = [d.to_dict() for d in diags]
+        for d in diags:
+            d["kb_ids"], d["kb_names"] = get_kb_names(d["kb_ids"])
+        return get_json_result(data=diags)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("dialog_ids")
+def rm():
+    req = request.json
+    dialog_list = []
+    tenants = UserTenantService.query(user_id=current_user.id)
+    try:
+        for id in req["dialog_ids"]:
+            for tenant in tenants:
+                if DialogService.query(tenant_id=tenant.tenant_id, id=id):
+                    break
+            else:
+                return get_json_result(
+                    data=False, message='Only owner of dialog authorized for this operation.',
+                    code=settings.RetCode.OPERATING_ERROR)
+            dialog_list.append({"id": id, "status": StatusEnum.INVALID.value})
+        DialogService.update_many_by_id(dialog_list)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
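`set_dialog` rejects a prompt configuration when a non-optional parameter never appears as a `{key}` placeholder in the system prompt. The check can be sketched on its own as below. The helper name `missing_required_parameters` is hypothetical. Unlike the handler, which returns an error at the first unused parameter, this sketch collects every missing key so a caller could report them together.

```python
def missing_required_parameters(system_prompt: str, parameters: list) -> list:
    """Return keys of non-optional parameters whose '{key}' placeholder is absent."""
    missing = []
    for p in parameters:
        if p.get("optional", False):
            continue  # optional parameters may legitimately be unused
        if "{%s}" % p["key"] not in system_prompt:
            missing.append(p["key"])
    return missing


if __name__ == "__main__":
    prompt = "Here is the knowledge base:\n{knowledge}\nThe above is the knowledge base."
    params = [
        {"key": "knowledge", "optional": False},
        {"key": "tone", "optional": True},
        {"key": "audience", "optional": False},
    ]
    print(missing_required_parameters(prompt, params))
```

Here `knowledge` is present, `tone` is optional, and `audience` is required but unused, so only `audience` is reported.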
--- a/api/apps/document_app.py
+++ b/api/apps/document_app.py
@ -0,0 +1,631 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import os.path
+import pathlib
+import re
+
+import flask
+from flask import request
+from flask_login import login_required, current_user
+
+from deepdoc.parser.html_parser import RAGFlowHtmlParser
+from rag.nlp import search
+
+from api.db import FileType, TaskStatus, ParserType, FileSource
+from api.db.db_models import File, Task
+from api.db.services.file2document_service import File2DocumentService
+from api.db.services.file_service import FileService
+from api.db.services.task_service import queue_tasks
+from api.db.services.user_service import UserTenantService
+from api.db.services import duplicate_name
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.task_service import TaskService
+from api.db.services.document_service import DocumentService, doc_upload_and_parse
+from api.utils.api_utils import (
+    server_error_response,
+    get_data_error_result,
+    validate_request,
+)
+from api.utils import get_uuid
+from api import settings
+from api.utils.api_utils import get_json_result
+from rag.utils.storage_factory import STORAGE_IMPL
+from api.utils.file_utils import filename_type, thumbnail, get_project_base_directory
+from api.utils.web_utils import html2pdf, is_valid_url
+from api.constants import IMG_BASE64_PREFIX
+
+
+@manager.route('/upload', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("kb_id")
+def upload():
+    kb_id = request.form.get("kb_id")
+    if not kb_id:
+        return get_json_result(
+            data=False, message='Lack of "KB ID"', code=settings.RetCode.ARGUMENT_ERROR)
+    if 'file' not in request.files:
+        return get_json_result(
+            data=False, message='No file part!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    file_objs = request.files.getlist('file')
+    for file_obj in file_objs:
+        if file_obj.filename == '':
+            return get_json_result(
+                data=False, message='No file selected!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    e, kb = KnowledgebaseService.get_by_id(kb_id)
+    if not e:
+        raise LookupError("Can't find this knowledgebase!")
+
+    err, _ = FileService.upload_document(kb, file_objs, current_user.id)
+    if err:
+        return get_json_result(
+            data=False, message="\n".join(err), code=settings.RetCode.SERVER_ERROR)
+    return get_json_result(data=True)
+
+
+@manager.route('/web_crawl', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("kb_id", "name", "url")
+def web_crawl():
+    kb_id = request.form.get("kb_id")
+    if not kb_id:
+        return get_json_result(
+            data=False, message='Lack of "KB ID"', code=settings.RetCode.ARGUMENT_ERROR)
+    name = request.form.get("name")
+    url = request.form.get("url")
+    if not is_valid_url(url):
+        return get_json_result(
+            data=False, message='The URL format is invalid', code=settings.RetCode.ARGUMENT_ERROR)
+    e, kb = KnowledgebaseService.get_by_id(kb_id)
+    if not e:
+        raise LookupError("Can't find this knowledgebase!")
+
+    blob = html2pdf(url)
+    if not blob:
+        return server_error_response(ValueError("Download failure."))
+
+    root_folder = FileService.get_root_folder(current_user.id)
+    pf_id = root_folder["id"]
+    FileService.init_knowledgebase_docs(pf_id, current_user.id)
+    kb_root_folder = FileService.get_kb_folder(current_user.id)
+    kb_folder = FileService.new_a_file_from_kb(kb.tenant_id, kb.name, kb_root_folder["id"])
+
+    try:
+        filename = duplicate_name(
+            DocumentService.query,
+            name=name + ".pdf",
+            kb_id=kb.id)
+        filetype = filename_type(filename)
+        if filetype == FileType.OTHER.value:
+            raise RuntimeError("This type of file has not been supported yet!")
+
+        location = filename
+        while STORAGE_IMPL.obj_exist(kb_id, location):
+            location += "_"
+        STORAGE_IMPL.put(kb_id, location, blob)
+        doc = {
+            "id": get_uuid(),
+            "kb_id": kb.id,
+            "parser_id": kb.parser_id,
+            "parser_config": kb.parser_config,
+            "created_by": current_user.id,
+            "type": filetype,
+            "name": filename,
+            "location": location,
+            "size": len(blob),
+            "thumbnail": thumbnail(filename, blob)
+        }
+        if doc["type"] == FileType.VISUAL:
+            doc["parser_id"] = ParserType.PICTURE.value
+        if doc["type"] == FileType.AURAL:
+            doc["parser_id"] = ParserType.AUDIO.value
+        if re.search(r"\.(ppt|pptx|pages)$", filename):
+            doc["parser_id"] = ParserType.PRESENTATION.value
+        if re.search(r"\.(eml)$", filename):
+            doc["parser_id"] = ParserType.EMAIL.value
+        DocumentService.insert(doc)
+        FileService.add_file_from_kb(doc, kb_folder["id"], kb.tenant_id)
+    except Exception as e:
+        return server_error_response(e)
+    return get_json_result(data=True)
+
+
+@manager.route('/create', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("name", "kb_id")
+def create():
+    req = request.json
+    kb_id = req["kb_id"]
+    if not kb_id:
+        return get_json_result(
+            data=False, message='Lack of "KB ID"', code=settings.RetCode.ARGUMENT_ERROR)
+
+    try:
+        e, kb = KnowledgebaseService.get_by_id(kb_id)
+        if not e:
+            return get_data_error_result(
+                message="Can't find this knowledgebase!")
+
+        if DocumentService.query(name=req["name"], kb_id=kb_id):
+            return get_data_error_result(
+                message="Duplicated document name in the same knowledgebase.")
+
+        doc = DocumentService.insert({
+            "id": get_uuid(),
+            "kb_id": kb.id,
+            "parser_id": kb.parser_id,
+            "parser_config": kb.parser_config,
+            "created_by": current_user.id,
+            "type": FileType.VIRTUAL,
+            "name": req["name"],
+            "location": "",
+            "size": 0
+        })
+        return get_json_result(data=doc.to_json())
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def list_docs():
+    kb_id = request.args.get("kb_id")
+    if not kb_id:
+        return get_json_result(
+            data=False, message='Lack of "KB ID"', code=settings.RetCode.ARGUMENT_ERROR)
+    tenants = UserTenantService.query(user_id=current_user.id)
+    for tenant in tenants:
+        if KnowledgebaseService.query(
+                tenant_id=tenant.tenant_id, id=kb_id):
+            break
+    else:
+        return get_json_result(
+            data=False, message='Only owner of knowledgebase authorized for this operation.',
+            code=settings.RetCode.OPERATING_ERROR)
+    keywords = request.args.get("keywords", "")
+
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 15))
+    orderby = request.args.get("orderby", "create_time")
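+    # Query-string values are strings, so "false" must be parsed explicitly; any other value keeps the default (descending).
+    desc = str(request.args.get("desc", "true")).lower() != "false"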
+    try:
+        docs, tol = DocumentService.get_by_kb_id(
+            kb_id, page_number, items_per_page, orderby, desc, keywords)
+
+        for doc_item in docs:
+            if doc_item['thumbnail'] and not doc_item['thumbnail'].startswith(IMG_BASE64_PREFIX):
+                doc_item['thumbnail'] = f"/v1/document/image/{kb_id}-{doc_item['thumbnail']}"
+
+        return get_json_result(data={"total": tol, "docs": docs})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/infos', methods=['POST'])  # noqa: F821
+@login_required
+def docinfos():
+    req = request.json
+    doc_ids = req["doc_ids"]
+    for doc_id in doc_ids:
+        if not DocumentService.accessible(doc_id, current_user.id):
+            return get_json_result(
+                data=False,
+                message='No authorization.',
+                code=settings.RetCode.AUTHENTICATION_ERROR
+            )
+    docs = DocumentService.get_by_ids(doc_ids)
+    return get_json_result(data=list(docs.dicts()))
+
+
+@manager.route('/thumbnails', methods=['GET'])  # noqa: F821
+# @login_required
+def thumbnails():
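+    # Tolerate a missing parameter and drop empty entries so the check below can actually fire.
+    doc_ids = [i for i in request.args.get("doc_ids", "").split(",") if i]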
+    if not doc_ids:
+        return get_json_result(
+            data=False, message='Lack of "Document ID"', code=settings.RetCode.ARGUMENT_ERROR)
+
+    try:
+        docs = DocumentService.get_thumbnails(doc_ids)
+
+        for doc_item in docs:
+            if doc_item['thumbnail'] and not doc_item['thumbnail'].startswith(IMG_BASE64_PREFIX):
+                doc_item['thumbnail'] = f"/v1/document/image/{doc_item['kb_id']}-{doc_item['thumbnail']}"
+
+        return get_json_result(data={d["id"]: d["thumbnail"] for d in docs})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/change_status', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id", "status")
+def change_status():
+    req = request.json
+    if str(req["status"]) not in ["0", "1"]:
+        return get_json_result(
+            data=False,
+            message='"Status" must be either 0 or 1!',
+            code=settings.RetCode.ARGUMENT_ERROR)
+
+    if not DocumentService.accessible(req["doc_id"], current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        e, kb = KnowledgebaseService.get_by_id(doc.kb_id)
+        if not e:
+            return get_data_error_result(
+                message="Can't find this knowledgebase!")
+
+        if not DocumentService.update_by_id(
+                req["doc_id"], {"status": str(req["status"])}):
+            return get_data_error_result(
+                message="Database error (Document update)!")
+
+        status = int(req["status"])
+        settings.docStoreConn.update({"doc_id": req["doc_id"]}, {"available_int": status},
+                                     search.index_name(kb.tenant_id), doc.kb_id)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id")
+def rm():
+    req = request.json
+    doc_ids = req["doc_id"]
+    if isinstance(doc_ids, str):
+        doc_ids = [doc_ids]
+
+    for doc_id in doc_ids:
+        if not DocumentService.accessible4deletion(doc_id, current_user.id):
+            return get_json_result(
+                data=False,
+                message='No authorization.',
+                code=settings.RetCode.AUTHENTICATION_ERROR
+            )
+
+    root_folder = FileService.get_root_folder(current_user.id)
+    pf_id = root_folder["id"]
+    FileService.init_knowledgebase_docs(pf_id, current_user.id)
+    errors = ""
+    for doc_id in doc_ids:
+        try:
+            e, doc = DocumentService.get_by_id(doc_id)
+            if not e:
+                return get_data_error_result(message="Document not found!")
+            tenant_id = DocumentService.get_tenant_id(doc_id)
+            if not tenant_id:
+                return get_data_error_result(message="Tenant not found!")
+
+            b, n = File2DocumentService.get_storage_address(doc_id=doc_id)
+
+            TaskService.filter_delete([Task.doc_id == doc_id])
+            if not DocumentService.remove_document(doc, tenant_id):
+                return get_data_error_result(
+                    message="Database error (Document removal)!")
+
+            f2d = File2DocumentService.get_by_document_id(doc_id)
+            FileService.filter_delete([File.source_type == FileSource.KNOWLEDGEBASE, File.id == f2d[0].file_id])
+            File2DocumentService.delete_by_document_id(doc_id)
+
+            STORAGE_IMPL.rm(b, n)
+        except Exception as e:
+            errors += str(e)
+
+    if errors:
+        return get_json_result(data=False, message=errors, code=settings.RetCode.SERVER_ERROR)
+
+    return get_json_result(data=True)
+
+
+@manager.route('/run', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_ids", "run")
+def run():
+    req = request.json
+    for doc_id in req["doc_ids"]:
+        if not DocumentService.accessible(doc_id, current_user.id):
+            return get_json_result(
+                data=False,
+                message='No authorization.',
+                code=settings.RetCode.AUTHENTICATION_ERROR
+            )
+    try:
+        for id in req["doc_ids"]:
+            info = {"run": str(req["run"]), "progress": 0}
+            if str(req["run"]) == TaskStatus.RUNNING.value and req.get("delete", False):
+                info["progress_msg"] = ""
+                info["chunk_num"] = 0
+                info["token_num"] = 0
+            DocumentService.update_by_id(id, info)
+            tenant_id = DocumentService.get_tenant_id(id)
+            if not tenant_id:
+                return get_data_error_result(message="Tenant not found!")
+            e, doc = DocumentService.get_by_id(id)
+            if not e:
+                return get_data_error_result(message="Document not found!")
+            if req.get("delete", False):
+                TaskService.filter_delete([Task.doc_id == id])
+                if settings.docStoreConn.indexExist(search.index_name(tenant_id), doc.kb_id):
+                    settings.docStoreConn.delete({"doc_id": id}, search.index_name(tenant_id), doc.kb_id)
+
+            if str(req["run"]) == TaskStatus.RUNNING.value:
+                e, doc = DocumentService.get_by_id(id)
+                doc = doc.to_dict()
+                doc["tenant_id"] = tenant_id
+                bucket, name = File2DocumentService.get_storage_address(doc_id=doc["id"])
+                queue_tasks(doc, bucket, name)
+
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rename', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id", "name")
+def rename():
+    req = request.json
+    if not DocumentService.accessible(req["doc_id"], current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        if pathlib.Path(req["name"].lower()).suffix != pathlib.Path(
+                doc.name.lower()).suffix:
+            return get_json_result(
+                data=False,
+                message="The extension of file can't be changed",
+                code=settings.RetCode.ARGUMENT_ERROR)
+        for d in DocumentService.query(name=req["name"], kb_id=doc.kb_id):
+            if d.name == req["name"]:
+                return get_data_error_result(
+                    message="Duplicated document name in the same knowledgebase.")
+
+        if not DocumentService.update_by_id(
+                req["doc_id"], {"name": req["name"]}):
+            return get_data_error_result(
+                message="Database error (Document rename)!")
+
+        informs = File2DocumentService.get_by_document_id(req["doc_id"])
+        if informs:
+            # Keep the linked file's name in sync; no need to load the file first.
+            FileService.update_by_id(informs[0].file_id, {"name": req["name"]})
+
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/get/<doc_id>', methods=['GET'])  # noqa: F821
+# @login_required
+def get(doc_id):
+    try:
+        e, doc = DocumentService.get_by_id(doc_id)
+        if not e:
+            return get_data_error_result(message="Document not found!")
+
+        b, n = File2DocumentService.get_storage_address(doc_id=doc_id)
+        response = flask.make_response(STORAGE_IMPL.get(b, n))
+
+        ext = re.search(r"\.([^.]+)$", doc.name)
+        if ext:
+            if doc.type == FileType.VISUAL.value:
+                response.headers.set('Content-Type', 'image/%s' % ext.group(1))
+            else:
+                response.headers.set(
+                    'Content-Type',
+                    'application/%s' %
+                    ext.group(1))
+        return response
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/change_parser', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id", "parser_id")
+def change_parser():
+    req = request.json
+
+    if not DocumentService.accessible(req["doc_id"], current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        if doc.parser_id.lower() == req["parser_id"].lower():
+            if "parser_config" in req:
+                if req["parser_config"] == doc.parser_config:
+                    return get_json_result(data=True)
+            else:
+                return get_json_result(data=True)
+
+        if ((doc.type == FileType.VISUAL.value and req["parser_id"] != "picture")
+                or (re.search(
+                    r"\.(ppt|pptx|pages)$", doc.name) and req["parser_id"] != "presentation")):
+            return get_data_error_result(message="Not supported yet!")
+
+        e = DocumentService.update_by_id(doc.id,
+                                         {"parser_id": req["parser_id"], "progress": 0, "progress_msg": "",
+                                          "run": TaskStatus.UNSTART.value})
+        if not e:
+            return get_data_error_result(message="Document not found!")
+        if "parser_config" in req:
+            DocumentService.update_parser_config(doc.id, req["parser_config"])
+        if doc.token_num > 0:
+            e = DocumentService.increment_chunk_num(doc.id, doc.kb_id, doc.token_num * -1, doc.chunk_num * -1,
+                                                    doc.process_duation * -1)
+            if not e:
+                return get_data_error_result(message="Document not found!")
+            tenant_id = DocumentService.get_tenant_id(req["doc_id"])
+            if not tenant_id:
+                return get_data_error_result(message="Tenant not found!")
+            if settings.docStoreConn.indexExist(search.index_name(tenant_id), doc.kb_id):
+                settings.docStoreConn.delete({"doc_id": doc.id}, search.index_name(tenant_id), doc.kb_id)
+
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/image/<image_id>', methods=['GET'])  # noqa: F821
+# @login_required
+def get_image(image_id):
+    try:
+        arr = image_id.split("-")
+        if len(arr) != 2:
+            return get_data_error_result(message="Image not found.")
+        bkt, nm = arr
+        response = flask.make_response(STORAGE_IMPL.get(bkt, nm))
+        response.headers.set('Content-Type', 'image/JPEG')
+        return response
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/upload_and_parse', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("conversation_id")
+def upload_and_parse():
+    if 'file' not in request.files:
+        return get_json_result(
+            data=False, message='No file part!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    file_objs = request.files.getlist('file')
+    for file_obj in file_objs:
+        if file_obj.filename == '':
+            return get_json_result(
+                data=False, message='No file selected!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    doc_ids = doc_upload_and_parse(request.form.get("conversation_id"), file_objs, current_user.id)
+
+    return get_json_result(data=doc_ids)
+
+
+@manager.route('/parse', methods=['POST'])  # noqa: F821
+@login_required
+def parse():
+    url = request.json.get("url") if request.json else ""
+    if url:
+        if not is_valid_url(url):
+            return get_json_result(
+                data=False, message='The URL format is invalid', code=settings.RetCode.ARGUMENT_ERROR)
+        download_path = os.path.join(get_project_base_directory(), "logs/downloads")
+        os.makedirs(download_path, exist_ok=True)
+        from seleniumwire.webdriver import Chrome, ChromeOptions
+        options = ChromeOptions()
+        options.add_argument('--headless')
+        options.add_argument('--disable-gpu')
+        options.add_argument('--no-sandbox')
+        options.add_argument('--disable-dev-shm-usage')
+        options.add_experimental_option('prefs', {
+            'download.default_directory': download_path,
+            'download.prompt_for_download': False,
+            'download.directory_upgrade': True,
+            'safebrowsing.enabled': True
+        })
+        driver = Chrome(options=options)
+        driver.get(url)
+        res_headers = [r.response.headers for r in driver.requests if r and r.response]
+        if len(res_headers) > 1:
+            sections = RAGFlowHtmlParser().parser_txt(driver.page_source)
+            driver.quit()
+            return get_json_result(data="\n".join(sections))
+
+        class File:
+            filename: str
+            filepath: str
+
+            def __init__(self, filename, filepath):
+                self.filename = filename
+                self.filepath = filepath
+
+            def read(self):
+                with open(self.filepath, "rb") as f:
+                    return f.read()
+
+        r = re.search(r"filename=\"([^\"]+)\"", str(res_headers))
+        if not r or not r.group(1):
+            return get_json_result(
+                data=False, message="Can't identify the downloaded file", code=settings.RetCode.ARGUMENT_ERROR)
+        f = File(r.group(1), os.path.join(download_path, r.group(1)))
+        txt = FileService.parse_docs([f], current_user.id)
+        return get_json_result(data=txt)
+
+    if 'file' not in request.files:
+        return get_json_result(
+            data=False, message='No file part!', code=settings.RetCode.ARGUMENT_ERROR)
+
+    file_objs = request.files.getlist('file')
+    txt = FileService.parse_docs(file_objs, current_user.id)
+
+    return get_json_result(data=txt)
+
+
+@manager.route('/set_meta', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("doc_id", "meta")
+def set_meta():
+    req = request.json
+    if not DocumentService.accessible(req["doc_id"], current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    try:
+        meta = json.loads(req["meta"])
+    except Exception as e:
+        return get_json_result(
+            data=False, message=f'JSON syntax error: {e}', code=settings.RetCode.ARGUMENT_ERROR)
+    if not isinstance(meta, dict):
+        return get_json_result(
+            data=False, message='Metadata should be a JSON object, like {"key": "value"}', code=settings.RetCode.ARGUMENT_ERROR)
+
+    try:
+        e, doc = DocumentService.get_by_id(req["doc_id"])
+        if not e:
+            return get_data_error_result(message="Document not found!")
+
+        if not DocumentService.update_by_id(
+                req["doc_id"], {"meta_fields": meta}):
+            return get_data_error_result(
+                message="Database error (meta updates)!")
+
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
--- a/api/apps/file2document_app.py
+++ b/api/apps/file2document_app.py
@ -0,0 +1,125 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from api.db.services.file2document_service import File2DocumentService
+from api.db.services.file_service import FileService
+
+from flask import request
+from flask_login import login_required, current_user
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
+from api.utils import get_uuid
+from api.db import FileType
+from api.db.services.document_service import DocumentService
+from api import settings
+from api.utils.api_utils import get_json_result
+
+
+@manager.route('/convert', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("file_ids", "kb_ids")
+def convert():
+    req = request.json
+    kb_ids = req["kb_ids"]
+    file_ids = req["file_ids"]
+    file2documents = []
+
+    try:
+        for file_id in file_ids:
+            e, file = FileService.get_by_id(file_id)
+            if not e:
+                return get_data_error_result(message="Can't find this file!")
+            file_ids_list = FileService.get_all_innermost_file_ids(file_id, []) if file.type == FileType.FOLDER.value else [file_id]
+            for id in file_ids_list:
+                informs = File2DocumentService.get_by_file_id(id)
+                # delete
+                for inform in informs:
+                    doc_id = inform.document_id
+                    e, doc = DocumentService.get_by_id(doc_id)
+                    if not e:
+                        return get_data_error_result(message="Document not found!")
+                    tenant_id = DocumentService.get_tenant_id(doc_id)
+                    if not tenant_id:
+                        return get_data_error_result(message="Tenant not found!")
+                    if not DocumentService.remove_document(doc, tenant_id):
+                        return get_data_error_result(
+                            message="Database error (Document removal)!")
+                File2DocumentService.delete_by_file_id(id)
+
+                # insert
+                for kb_id in kb_ids:
+                    e, kb = KnowledgebaseService.get_by_id(kb_id)
+                    if not e:
+                        return get_data_error_result(
+                            message="Can't find this knowledgebase!")
+                    e, file = FileService.get_by_id(id)
+                    if not e:
+                        return get_data_error_result(
+                            message="Can't find this file!")
+
+                    doc = DocumentService.insert({
+                        "id": get_uuid(),
+                        "kb_id": kb.id,
+                        "parser_id": FileService.get_parser(file.type, file.name, kb.parser_id),
+                        "parser_config": kb.parser_config,
+                        "created_by": current_user.id,
+                        "type": file.type,
+                        "name": file.name,
+                        "location": file.location,
+                        "size": file.size
+                    })
+                    file2document = File2DocumentService.insert({
+                        "id": get_uuid(),
+                        "file_id": id,
+                        "document_id": doc.id,
+                    })
+                    file2documents.append(file2document.to_json())
+        return get_json_result(data=file2documents)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("file_ids")
+def rm():
+    req = request.json
+    file_ids = req["file_ids"]
+    if not file_ids:
+        return get_json_result(
+            data=False, message='Missing "file_ids"', code=settings.RetCode.ARGUMENT_ERROR)
+    try:
+        for file_id in file_ids:
+            informs = File2DocumentService.get_by_file_id(file_id)
+            if not informs:
+                return get_data_error_result(message="Inform not found!")
+            for inform in informs:
+                if not inform:
+                    return get_data_error_result(message="Inform not found!")
+                File2DocumentService.delete_by_file_id(file_id)
+                doc_id = inform.document_id
+                e, doc = DocumentService.get_by_id(doc_id)
+                if not e:
+                    return get_data_error_result(message="Document not found!")
+                tenant_id = DocumentService.get_tenant_id(doc_id)
+                if not tenant_id:
+                    return get_data_error_result(message="Tenant not found!")
+                if not DocumentService.remove_document(doc, tenant_id):
+                    return get_data_error_result(
+                        message="Database error (Document removal)!")
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
--- a/api/apps/file_app.py
+++ b/api/apps/file_app.py
@ -0,0 +1,373 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import os
+import pathlib
+import re
+
+import flask
+from flask import request
+from flask_login import login_required, current_user
+
+from api.db.services.document_service import DocumentService
+from api.db.services.file2document_service import File2DocumentService
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
+from api.utils import get_uuid
+from api.db import FileType, FileSource
+from api.db.services import duplicate_name
+from api.db.services.file_service import FileService
+from api import settings
+from api.utils.api_utils import get_json_result
+from api.utils.file_utils import filename_type
+from rag.utils.storage_factory import STORAGE_IMPL
+
+
+@manager.route('/upload', methods=['POST'])  # noqa: F821
+@login_required
+# @validate_request("parent_id")
+def upload():
+    pf_id = request.form.get("parent_id")
+
+    if not pf_id:
+        root_folder = FileService.get_root_folder(current_user.id)
+        pf_id = root_folder["id"]
+
+    if 'file' not in request.files:
+        return get_json_result(
+            data=False, message='No file part!', code=settings.RetCode.ARGUMENT_ERROR)
+    file_objs = request.files.getlist('file')
+
+    for file_obj in file_objs:
+        if file_obj.filename == '':
+            return get_json_result(
+                data=False, message='No file selected!', code=settings.RetCode.ARGUMENT_ERROR)
+    file_res = []
+    try:
+        for file_obj in file_objs:
+            e, file = FileService.get_by_id(pf_id)
+            if not e:
+                return get_data_error_result(
+                    message="Can't find this folder!")
+            MAX_FILE_NUM_PER_USER = int(os.environ.get('MAX_FILE_NUM_PER_USER', 0))
+            if MAX_FILE_NUM_PER_USER > 0 and DocumentService.get_doc_count(current_user.id) >= MAX_FILE_NUM_PER_USER:
+                return get_data_error_result(
+                    message="Exceed the maximum file number of a free user!")
+
+            # split file name path
+            if not file_obj.filename:
+                e, file = FileService.get_by_id(pf_id)
+                file_obj_names = [file.name, file_obj.filename]
+            else:
+                full_path = '/' + file_obj.filename
+                file_obj_names = full_path.split('/')
+            file_len = len(file_obj_names)
+
+            # get folder
+            file_id_list = FileService.get_id_list_by_id(pf_id, file_obj_names, 1, [pf_id])
+            len_id_list = len(file_id_list)
+
+            # create folder
+            if file_len != len_id_list:
+                e, file = FileService.get_by_id(file_id_list[len_id_list - 1])
+                if not e:
+                    return get_data_error_result(message="Folder not found!")
+                last_folder = FileService.create_folder(file, file_id_list[len_id_list - 1], file_obj_names,
+                                                        len_id_list)
+            else:
+                e, file = FileService.get_by_id(file_id_list[len_id_list - 2])
+                if not e:
+                    return get_data_error_result(message="Folder not found!")
+                last_folder = FileService.create_folder(file, file_id_list[len_id_list - 2], file_obj_names,
+                                                        len_id_list)
+
+            # file type
+            filetype = filename_type(file_obj_names[file_len - 1])
+            location = file_obj_names[file_len - 1]
+            while STORAGE_IMPL.obj_exist(last_folder.id, location):
+                location += "_"
+            blob = file_obj.read()
+            filename = duplicate_name(
+                FileService.query,
+                name=file_obj_names[file_len - 1],
+                parent_id=last_folder.id)
+            file = {
+                "id": get_uuid(),
+                "parent_id": last_folder.id,
+                "tenant_id": current_user.id,
+                "created_by": current_user.id,
+                "type": filetype,
+                "name": filename,
+                "location": location,
+                "size": len(blob),
+            }
+            file = FileService.insert(file)
+            STORAGE_IMPL.put(last_folder.id, location, blob)
+            file_res.append(file.to_json())
+        return get_json_result(data=file_res)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/create', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("name")
+def create():
+    req = request.json
+    pf_id = request.json.get("parent_id")
+    input_file_type = request.json.get("type")
+    if not pf_id:
+        root_folder = FileService.get_root_folder(current_user.id)
+        pf_id = root_folder["id"]
+
+    try:
+        if not FileService.is_parent_folder_exist(pf_id):
+            return get_json_result(
+                data=False, message="Parent Folder Doesn't Exist!", code=settings.RetCode.OPERATING_ERROR)
+        if FileService.query(name=req["name"], parent_id=pf_id):
+            return get_data_error_result(
+                message="Duplicated folder name in the same folder.")
+
+        if input_file_type == FileType.FOLDER.value:
+            file_type = FileType.FOLDER.value
+        else:
+            file_type = FileType.VIRTUAL.value
+
+        file = FileService.insert({
+            "id": get_uuid(),
+            "parent_id": pf_id,
+            "tenant_id": current_user.id,
+            "created_by": current_user.id,
+            "name": req["name"],
+            "location": "",
+            "size": 0,
+            "type": file_type
+        })
+
+        return get_json_result(data=file.to_json())
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def list_files():
+    pf_id = request.args.get("parent_id")
+
+    keywords = request.args.get("keywords", "")
+
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 15))
+    orderby = request.args.get("orderby", "create_time")
+    desc = request.args.get("desc", "true").lower() != "false"  # query args are strings; "false" must not be truthy
+    if not pf_id:
+        root_folder = FileService.get_root_folder(current_user.id)
+        pf_id = root_folder["id"]
+        FileService.init_knowledgebase_docs(pf_id, current_user.id)
+    try:
+        e, file = FileService.get_by_id(pf_id)
+        if not e:
+            return get_data_error_result(message="Folder not found!")
+
+        files, total = FileService.get_by_pf_id(
+            current_user.id, pf_id, page_number, items_per_page, orderby, desc, keywords)
+
+        parent_folder = FileService.get_parent_folder(pf_id)
+        if not parent_folder:
+            return get_json_result(message="File not found!")
+
+        return get_json_result(data={"total": total, "files": files, "parent_folder": parent_folder.to_json()})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/root_folder', methods=['GET'])  # noqa: F821
+@login_required
+def get_root_folder():
+    try:
+        root_folder = FileService.get_root_folder(current_user.id)
+        return get_json_result(data={"root_folder": root_folder})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/parent_folder', methods=['GET'])  # noqa: F821
+@login_required
+def get_parent_folder():
+    file_id = request.args.get("file_id")
+    try:
+        e, file = FileService.get_by_id(file_id)
+        if not e:
+            return get_data_error_result(message="Folder not found!")
+
+        parent_folder = FileService.get_parent_folder(file_id)
+        return get_json_result(data={"parent_folder": parent_folder.to_json()})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/all_parent_folder', methods=['GET'])  # noqa: F821
+@login_required
+def get_all_parent_folders():
+    file_id = request.args.get("file_id")
+    try:
+        e, file = FileService.get_by_id(file_id)
+        if not e:
+            return get_data_error_result(message="Folder not found!")
+
+        parent_folders = FileService.get_all_parent_folders(file_id)
+        parent_folders_res = []
+        for parent_folder in parent_folders:
+            parent_folders_res.append(parent_folder.to_json())
+        return get_json_result(data={"parent_folders": parent_folders_res})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("file_ids")
+def rm():
+    req = request.json
+    file_ids = req["file_ids"]
+    try:
+        for file_id in file_ids:
+            e, file = FileService.get_by_id(file_id)
+            if not e:
+                return get_data_error_result(message="File or Folder not found!")
+            if not file.tenant_id:
+                return get_data_error_result(message="Tenant not found!")
+            if file.source_type == FileSource.KNOWLEDGEBASE:
+                continue
+
+            if file.type == FileType.FOLDER.value:
+                file_id_list = FileService.get_all_innermost_file_ids(file_id, [])
+                for inner_file_id in file_id_list:
+                    e, file = FileService.get_by_id(inner_file_id)
+                    if not e:
+                        return get_data_error_result(message="File not found!")
+                    STORAGE_IMPL.rm(file.parent_id, file.location)
+                FileService.delete_folder_by_pf_id(current_user.id, file_id)
+            else:
+                if not FileService.delete(file):
+                    return get_data_error_result(
+                        message="Database error (File removal)!")
+
+            # delete file2document
+            informs = File2DocumentService.get_by_file_id(file_id)
+            for inform in informs:
+                doc_id = inform.document_id
+                e, doc = DocumentService.get_by_id(doc_id)
+                if not e:
+                    return get_data_error_result(message="Document not found!")
+                tenant_id = DocumentService.get_tenant_id(doc_id)
+                if not tenant_id:
+                    return get_data_error_result(message="Tenant not found!")
+                if not DocumentService.remove_document(doc, tenant_id):
+                    return get_data_error_result(
+                        message="Database error (Document removal)!")
+            File2DocumentService.delete_by_file_id(file_id)
+
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rename', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("file_id", "name")
+def rename():
+    req = request.json
+    try:
+        e, file = FileService.get_by_id(req["file_id"])
+        if not e:
+            return get_data_error_result(message="File not found!")
+        if file.type != FileType.FOLDER.value \
+            and pathlib.Path(req["name"].lower()).suffix != pathlib.Path(
+                file.name.lower()).suffix:
+            return get_json_result(
+                data=False,
+                message="The extension of file can't be changed",
+                code=settings.RetCode.ARGUMENT_ERROR)
+        for f in FileService.query(name=req["name"], parent_id=file.parent_id):
+            if f.name == req["name"]:
+                return get_data_error_result(
+                    message="Duplicated file name in the same folder.")
+
+        if not FileService.update_by_id(
+                req["file_id"], {"name": req["name"]}):
+            return get_data_error_result(
+                message="Database error (File rename)!")
+
+        informs = File2DocumentService.get_by_file_id(req["file_id"])
+        if informs:
+            if not DocumentService.update_by_id(
+                    informs[0].document_id, {"name": req["name"]}):
+                return get_data_error_result(
+                    message="Database error (Document rename)!")
+
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/get/<file_id>', methods=['GET'])  # noqa: F821
+@login_required
+def get(file_id):
+    try:
+        e, file = FileService.get_by_id(file_id)
+        if not e:
+            return get_data_error_result(message="File not found!")
+
+        blob = STORAGE_IMPL.get(file.parent_id, file.location)
+        if not blob:
+            b, n = File2DocumentService.get_storage_address(file_id=file_id)
+            blob = STORAGE_IMPL.get(b, n)
+
+        response = flask.make_response(blob)
+        ext = re.search(r"\.([^.]+)$", file.name)
+        if ext:
+            if file.type == FileType.VISUAL.value:
+                response.headers.set('Content-Type', 'image/%s' % ext.group(1))
+            else:
+                response.headers.set(
+                    'Content-Type',
+                    'application/%s' %
+                    ext.group(1))
+        return response
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/mv', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("src_file_ids", "dest_file_id")
+def move():
+    req = request.json
+    try:
+        file_ids = req["src_file_ids"]
+        parent_id = req["dest_file_id"]
+        for file_id in file_ids:
+            e, file = FileService.get_by_id(file_id)
+            if not e:
+                return get_data_error_result(message="File or Folder not found!")
+            if not file.tenant_id:
+                return get_data_error_result(message="Tenant not found!")
+        fe, _ = FileService.get_by_id(parent_id)
+        if not fe:
+            return get_data_error_result(message="Parent Folder not found!")
+        FileService.move_file(file_ids, parent_id)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
--- a/api/apps/kb_app.py
+++ b/api/apps/kb_app.py
@ -0,0 +1,325 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import json
+import os
+
+from flask import request
+from flask_login import login_required, current_user
+
+from api.db.services import duplicate_name
+from api.db.services.document_service import DocumentService
+from api.db.services.file2document_service import File2DocumentService
+from api.db.services.file_service import FileService
+from api.db.services.user_service import TenantService, UserTenantService
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request, not_allowed_parameters
+from api.utils import get_uuid
+from api.db import StatusEnum, FileSource
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.db_models import File
+from api.utils.api_utils import get_json_result
+from api import settings
+from rag.nlp import search
+from api.constants import DATASET_NAME_LIMIT
+from rag.settings import PAGERANK_FLD
+
+
+@manager.route('/create', methods=['post'])  # noqa: F821
+@login_required
+@validate_request("name")
+def create():
+    req = request.json
+    dataset_name = req["name"]
+    if not isinstance(dataset_name, str):
+        return get_data_error_result(message="Dataset name must be string.")
+    if dataset_name == "":
+        return get_data_error_result(message="Dataset name can't be empty.")
+    if len(dataset_name) >= DATASET_NAME_LIMIT:
+        return get_data_error_result(
+            message=f"Dataset name length is {len(dataset_name)}, which must be less than {DATASET_NAME_LIMIT}.")
+
+    dataset_name = dataset_name.strip()
+    dataset_name = duplicate_name(
+        KnowledgebaseService.query,
+        name=dataset_name,
+        tenant_id=current_user.id,
+        status=StatusEnum.VALID.value)
+    try:
+        req["id"] = get_uuid()
+        req["tenant_id"] = current_user.id
+        req["created_by"] = current_user.id
+        e, t = TenantService.get_by_id(current_user.id)
+        if not e:
+            return get_data_error_result(message="Tenant not found.")
+        req["embd_id"] = t.embd_id
+        if not KnowledgebaseService.save(**req):
+            return get_data_error_result()
+        return get_json_result(data={"kb_id": req["id"]})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/update', methods=['post'])  # noqa: F821
+@login_required
+@validate_request("kb_id", "name", "description", "permission", "parser_id")
+@not_allowed_parameters("id", "tenant_id", "created_by", "create_time", "update_time", "create_date", "update_date")
+def update():
+    req = request.json
+    req["name"] = req["name"].strip()
+    if not KnowledgebaseService.accessible4deletion(req["kb_id"], current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    try:
+        if not KnowledgebaseService.query(
+                created_by=current_user.id, id=req["kb_id"]):
+            return get_json_result(
+                data=False, message='Only owner of knowledgebase authorized for this operation.',
+                code=settings.RetCode.OPERATING_ERROR)
+
+        e, kb = KnowledgebaseService.get_by_id(req["kb_id"])
+        if not e:
+            return get_data_error_result(
+                message="Can't find this knowledgebase!")
+
+        if req.get("parser_id", "") == "tag" and os.environ.get('DOC_ENGINE', "elasticsearch") == "infinity":
+            return get_json_result(
+                data=False,
+                message='The chunk method Tag has not been supported by Infinity yet.',
+                code=settings.RetCode.OPERATING_ERROR
+            )
+
+        if req["name"].lower() != kb.name.lower() \
+                and len(
+            KnowledgebaseService.query(name=req["name"], tenant_id=current_user.id, status=StatusEnum.VALID.value)) >= 1:
+            return get_data_error_result(
+                message="Duplicated knowledgebase name.")
+
+        del req["kb_id"]
+        if not KnowledgebaseService.update_by_id(kb.id, req):
+            return get_data_error_result()
+
+        if kb.pagerank != req.get("pagerank", 0):
+            if req.get("pagerank", 0) > 0:
+                settings.docStoreConn.update({"kb_id": kb.id}, {PAGERANK_FLD: req["pagerank"]},
+                                         search.index_name(kb.tenant_id), kb.id)
+            else:
+                # Elasticsearch requires PAGERANK_FLD to be non-zero, so remove the field instead of setting it to 0.
+                settings.docStoreConn.update({"exists": PAGERANK_FLD}, {"remove": PAGERANK_FLD},
+                                         search.index_name(kb.tenant_id), kb.id)
+
+        e, kb = KnowledgebaseService.get_by_id(kb.id)
+        if not e:
+            return get_data_error_result(
+                message="Database error (Knowledgebase rename)!")
+        kb = kb.to_dict()
+        kb.update(req)
+
+        return get_json_result(data=kb)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/detail', methods=['GET'])  # noqa: F821
+@login_required
+def detail():
+    kb_id = request.args["kb_id"]
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        for tenant in tenants:
+            if KnowledgebaseService.query(
+                    tenant_id=tenant.tenant_id, id=kb_id):
+                break
+        else:
+            return get_json_result(
+                data=False, message='Only owner of knowledgebase authorized for this operation.',
+                code=settings.RetCode.OPERATING_ERROR)
+        kb = KnowledgebaseService.get_detail(kb_id)
+        if not kb:
+            return get_data_error_result(
+                message="Can't find this knowledgebase!")
+        return get_json_result(data=kb)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def list_kbs():
+    keywords = request.args.get("keywords", "")
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 150))
+    parser_id = request.args.get("parser_id")
+    orderby = request.args.get("orderby", "create_time")
+    desc = request.args.get("desc", "true").lower() != "false"
+    try:
+        tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
+        kbs, total = KnowledgebaseService.get_by_tenant_ids(
+            [m["tenant_id"] for m in tenants], current_user.id, page_number,
+            items_per_page, orderby, desc, keywords, parser_id)
+        return get_json_result(data={"kbs": kbs, "total": total})
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/rm', methods=['post'])  # noqa: F821
+@login_required
+@validate_request("kb_id")
+def rm():
+    req = request.json
+    if not KnowledgebaseService.accessible4deletion(req["kb_id"], current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    try:
+        kbs = KnowledgebaseService.query(
+            created_by=current_user.id, id=req["kb_id"])
+        if not kbs:
+            return get_json_result(
+                data=False, message='Only owner of knowledgebase authorized for this operation.',
+                code=settings.RetCode.OPERATING_ERROR)
+
+        for doc in DocumentService.query(kb_id=req["kb_id"]):
+            if not DocumentService.remove_document(doc, kbs[0].tenant_id):
+                return get_data_error_result(
+                    message="Database error (Document removal)!")
+            f2d = File2DocumentService.get_by_document_id(doc.id)
+            if f2d:
+                FileService.filter_delete([File.source_type == FileSource.KNOWLEDGEBASE, File.id == f2d[0].file_id])
+            File2DocumentService.delete_by_document_id(doc.id)
+        FileService.filter_delete(
+            [File.source_type == FileSource.KNOWLEDGEBASE, File.type == "folder", File.name == kbs[0].name])
+        if not KnowledgebaseService.delete_by_id(req["kb_id"]):
+            return get_data_error_result(
+                message="Database error (Knowledgebase removal)!")
+        for kb in kbs:
+            settings.docStoreConn.delete({"kb_id": kb.id}, search.index_name(kb.tenant_id), kb.id)
+            settings.docStoreConn.deleteIdx(search.index_name(kb.tenant_id), kb.id)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/<kb_id>/tags', methods=['GET'])  # noqa: F821
+@login_required
+def list_tags(kb_id):
+    if not KnowledgebaseService.accessible(kb_id, current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+
+    tags = settings.retrievaler.all_tags(current_user.id, [kb_id])
+    return get_json_result(data=tags)
+
+
+@manager.route('/tags', methods=['GET'])  # noqa: F821
+@login_required
+def list_tags_from_kbs():
+    kb_ids = request.args.get("kb_ids", "").split(",")
+    for kb_id in kb_ids:
+        if not KnowledgebaseService.accessible(kb_id, current_user.id):
+            return get_json_result(
+                data=False,
+                message='No authorization.',
+                code=settings.RetCode.AUTHENTICATION_ERROR
+            )
+
+    tags = settings.retrievaler.all_tags(current_user.id, kb_ids)
+    return get_json_result(data=tags)
+
+
+@manager.route('/<kb_id>/rm_tags', methods=['POST'])  # noqa: F821
+@login_required
+def rm_tags(kb_id):
+    req = request.json
+    if not KnowledgebaseService.accessible(kb_id, current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    e, kb = KnowledgebaseService.get_by_id(kb_id)
+
+    for t in req["tags"]:
+        settings.docStoreConn.update({"tag_kwd": t, "kb_id": [kb_id]},
+                                     {"remove": {"tag_kwd": t}},
+                                     search.index_name(kb.tenant_id),
+                                     kb_id)
+    return get_json_result(data=True)
+
+
+@manager.route('/<kb_id>/rename_tag', methods=['POST'])  # noqa: F821
+@login_required
+def rename_tags(kb_id):
+    req = request.json
+    if not KnowledgebaseService.accessible(kb_id, current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    e, kb = KnowledgebaseService.get_by_id(kb_id)
+
+    settings.docStoreConn.update({"tag_kwd": req["from_tag"], "kb_id": [kb_id]},
+                                 {"remove": {"tag_kwd": req["from_tag"].strip()}, "add": {"tag_kwd": req["to_tag"]}},
+                                 search.index_name(kb.tenant_id),
+                                 kb_id)
+    return get_json_result(data=True)
+
+
+@manager.route('/<kb_id>/knowledge_graph', methods=['GET'])  # noqa: F821
+@login_required
+def knowledge_graph(kb_id):
+    if not KnowledgebaseService.accessible(kb_id, current_user.id):
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR
+        )
+    _, kb = KnowledgebaseService.get_by_id(kb_id)
+    req = {
+        "kb_id": [kb_id],
+        "knowledge_graph_kwd": ["graph"]
+    }
+
+    obj = {"graph": {}, "mind_map": {}}
+    if not settings.docStoreConn.indexExist(search.index_name(kb.tenant_id), kb_id):
+        return get_json_result(data=obj)
+    sres = settings.retrievaler.search(req, search.index_name(kb.tenant_id), [kb_id])
+    if not len(sres.ids):
+        return get_json_result(data=obj)
+
+    for id in sres.ids[:1]:
+        ty = sres.field[id]["knowledge_graph_kwd"]
+        try:
+            content_json = json.loads(sres.field[id]["content_with_weight"])
+        except Exception:
+            continue
+
+        obj[ty] = content_json
+
+    if "nodes" in obj["graph"]:
+        obj["graph"]["nodes"] = sorted(obj["graph"]["nodes"], key=lambda x: x.get("pagerank", 0), reverse=True)[:256]
+        if "edges" in obj["graph"]:
+            node_id_set = {o["id"] for o in obj["graph"]["nodes"]}
+            filtered_edges = [o for o in obj["graph"]["edges"] if o["source"] != o["target"] and o["source"] in node_id_set and o["target"] in node_id_set]
+            obj["graph"]["edges"] = sorted(filtered_edges, key=lambda x: x.get("weight", 0), reverse=True)[:128]
+    return get_json_result(data=obj)
--- a/api/apps/llm_app.py
+++ b/api/apps/llm_app.py
@ -0,0 +1,370 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+import json
+import os
+from flask import request
+from flask_login import login_required, current_user
+from api.db.services.llm_service import LLMFactoriesService, TenantLLMService, LLMService
+from api import settings
+from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
+from api.db import StatusEnum, LLMType
+from api.db.db_models import TenantLLM
+from api.utils.api_utils import get_json_result
+from api.utils.file_utils import get_project_base_directory
+from rag.llm import EmbeddingModel, ChatModel, RerankModel, CvModel, TTSModel
+
+
+@manager.route('/factories', methods=['GET'])  # noqa: F821
+@login_required
+def factories():
+    try:
+        fac = LLMFactoriesService.get_all()
+        fac = [f.to_dict() for f in fac if f.name not in ["Youdao", "FastEmbed", "BAAI"]]
+        llms = LLMService.get_all()
+        mdl_types = {}
+        for m in llms:
+            if m.status != StatusEnum.VALID.value:
+                continue
+            if m.fid not in mdl_types:
+                mdl_types[m.fid] = set()
+            mdl_types[m.fid].add(m.model_type)
+        for f in fac:
+            f["model_types"] = list(mdl_types.get(f["name"], [LLMType.CHAT, LLMType.EMBEDDING, LLMType.RERANK,
+                                                              LLMType.IMAGE2TEXT, LLMType.SPEECH2TEXT, LLMType.TTS]))
+        return get_json_result(data=fac)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/set_api_key', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("llm_factory", "api_key")
+def set_api_key():
+    req = request.json
+    # test if api key works
+    chat_passed, embd_passed, rerank_passed = False, False, False
+    factory = req["llm_factory"]
+    msg = ""
+    for llm in LLMService.query(fid=factory):
+        if not embd_passed and llm.model_type == LLMType.EMBEDDING.value:
+            mdl = EmbeddingModel[factory](
+                req["api_key"], llm.llm_name, base_url=req.get("base_url"))
+            try:
+                arr, tc = mdl.encode(["Test if the api key is available"])
+                if len(arr[0]) == 0:
+                    raise Exception("Fail")
+                embd_passed = True
+            except Exception as e:
+                msg += f"\nFail to access embedding model({llm.llm_name}) using this api key." + str(e)
+        elif not chat_passed and llm.model_type == LLMType.CHAT.value:
+            mdl = ChatModel[factory](
+                req["api_key"], llm.llm_name, base_url=req.get("base_url"))
+            try:
+                m, tc = mdl.chat(None, [{"role": "user", "content": "Hello! How are you doing!"}],
+                                 {"temperature": 0.9, 'max_tokens': 50})
+                if m.find("**ERROR**") >= 0:
+                    raise Exception(m)
+                chat_passed = True
+            except Exception as e:
+                msg += f"\nFail to access model({llm.llm_name}) using this api key." + str(
+                    e)
+        elif not rerank_passed and llm.model_type == LLMType.RERANK.value:
+            mdl = RerankModel[factory](
+                req["api_key"], llm.llm_name, base_url=req.get("base_url"))
+            try:
+                arr, tc = mdl.similarity("What's the weather?", ["Is it sunny today?"])
+                if len(arr) == 0 or tc == 0:
+                    raise Exception("Fail")
+                rerank_passed = True
+                logging.debug(f'passed model rerank {llm.llm_name}')
+            except Exception as e:
+                msg += f"\nFail to access model({llm.llm_name}) using this api key." + str(
+                    e)
+        if any([embd_passed, chat_passed, rerank_passed]):
+            msg = ''
+            break
+
+    if msg:
+        return get_data_error_result(message=msg)
+
+    llm_config = {
+        "api_key": req["api_key"],
+        "api_base": req.get("base_url", "")
+    }
+    for n in ["model_type", "llm_name"]:
+        if n in req:
+            llm_config[n] = req[n]
+
+    for llm in LLMService.query(fid=factory):
+        llm_config["max_tokens"] = llm.max_tokens
+        if not TenantLLMService.filter_update(
+                [TenantLLM.tenant_id == current_user.id,
+                 TenantLLM.llm_factory == factory,
+                 TenantLLM.llm_name == llm.llm_name],
+                llm_config):
+            TenantLLMService.save(
+                tenant_id=current_user.id,
+                llm_factory=factory,
+                llm_name=llm.llm_name,
+                model_type=llm.model_type,
+                api_key=llm_config["api_key"],
+                api_base=llm_config["api_base"],
+                max_tokens=llm_config["max_tokens"]
+            )
+
+    return get_json_result(data=True)
+
+
+@manager.route('/add_llm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("llm_factory")
+def add_llm():
+    req = request.json
+    factory = req["llm_factory"]
+
+    def apikey_json(keys):
+        nonlocal req
+        return json.dumps({k: req.get(k, "") for k in keys})
+
+    if factory == "VolcEngine":
+        # VolcEngine uses a special authentication scheme:
+        # pack ark_api_key and endpoint_id into api_key as JSON.
+        llm_name = req["llm_name"]
+        api_key = apikey_json(["ark_api_key", "endpoint_id"])
+
+    elif factory == "Tencent Hunyuan":
+        req["api_key"] = apikey_json(["hunyuan_sid", "hunyuan_sk"])
+        return set_api_key()
+
+    elif factory == "Tencent Cloud":
+        req["api_key"] = apikey_json(["tencent_cloud_sid", "tencent_cloud_sk"])
+        return set_api_key()
+
+    elif factory == "Bedrock":
+        # Bedrock uses a special authentication scheme:
+        # pack bedrock_ak, bedrock_sk, and bedrock_region into api_key as JSON.
+        llm_name = req["llm_name"]
+        api_key = apikey_json(["bedrock_ak", "bedrock_sk", "bedrock_region"])
+
+    elif factory == "LocalAI":
+        llm_name = req["llm_name"] + "___LocalAI"
+        api_key = "xxxxxxxxxxxxxxx"
+
+    elif factory == "HuggingFace":
+        llm_name = req["llm_name"] + "___HuggingFace"
+        api_key = "xxxxxxxxxxxxxxx"
+
+    elif factory == "OpenAI-API-Compatible":
+        llm_name = req["llm_name"] + "___OpenAI-API"
+        api_key = req.get("api_key", "xxxxxxxxxxxxxxx")
+
+    elif factory == "VLLM":
+        llm_name = req["llm_name"] + "___VLLM"
+        api_key = req.get("api_key", "xxxxxxxxxxxxxxx")
+
+    elif factory == "XunFei Spark":
+        llm_name = req["llm_name"]
+        if req["model_type"] == "chat":
+            api_key = req.get("spark_api_password", "xxxxxxxxxxxxxxx")
+        elif req["model_type"] == "tts":
+            api_key = apikey_json(["spark_app_id", "spark_api_secret", "spark_api_key"])
+
+    elif factory == "BaiduYiyan":
+        llm_name = req["llm_name"]
+        api_key = apikey_json(["yiyan_ak", "yiyan_sk"])
+
+    elif factory == "Fish Audio":
+        llm_name = req["llm_name"]
+        api_key = apikey_json(["fish_audio_ak", "fish_audio_refid"])
+
+    elif factory == "Google Cloud":
+        llm_name = req["llm_name"]
+        api_key = apikey_json(["google_project_id", "google_region", "google_service_account_key"])
+
+    elif factory == "Azure-OpenAI":
+        llm_name = req["llm_name"]
+        api_key = apikey_json(["api_key", "api_version"])
+
+    else:
+        llm_name = req["llm_name"]
+        api_key = req.get("api_key", "xxxxxxxxxxxxxxx")
+
+    llm = {
+        "tenant_id": current_user.id,
+        "llm_factory": factory,
+        "model_type": req["model_type"],
+        "llm_name": llm_name,
+        "api_base": req.get("api_base", ""),
+        "api_key": api_key,
+        "max_tokens": req.get("max_tokens")
+    }
+
+    msg = ""
+    mdl_nm = llm["llm_name"].split("___")[0]
+    if llm["model_type"] == LLMType.EMBEDDING.value:
+        mdl = EmbeddingModel[factory](
+            key=llm['api_key'],
+            model_name=mdl_nm,
+            base_url=llm["api_base"])
+        try:
+            arr, tc = mdl.encode(["Test if the api key is available"])
+            if len(arr[0]) == 0:
+                raise Exception("Fail")
+        except Exception as e:
+            msg += f"\nFail to access embedding model({mdl_nm})." + str(e)
+    elif llm["model_type"] == LLMType.CHAT.value:
+        mdl = ChatModel[factory](
+            key=llm['api_key'],
+            model_name=mdl_nm,
+            base_url=llm["api_base"]
+        )
+        try:
+            m, tc = mdl.chat(None, [{"role": "user", "content": "Hello! How are you doing!"}], {
+                "temperature": 0.9})
+            if not tc and m.find("**ERROR**:") >= 0:
+                raise Exception(m)
+        except Exception as e:
+            msg += f"\nFail to access model({mdl_nm})." + str(
+                e)
+    elif llm["model_type"] == LLMType.RERANK.value:
+        try:
+            mdl = RerankModel[factory](
+                key=llm["api_key"],
+                model_name=mdl_nm,
+                base_url=llm["api_base"]
+            )
+            arr, tc = mdl.similarity("Hello~ Ragflower!", ["Hi, there!", "Ohh, my friend!"])
+            if len(arr) == 0:
+                raise Exception("Not known.")
+        except KeyError:
+            msg += f"{factory} does not support this model({mdl_nm})"
+        except Exception as e:
+            msg += f"\nFail to access model({mdl_nm})." + str(
+                e)
+    elif llm["model_type"] == LLMType.IMAGE2TEXT.value:
+        mdl = CvModel[factory](
+            key=llm["api_key"],
+            model_name=mdl_nm,
+            base_url=llm["api_base"]
+        )
+        try:
+            with open(os.path.join(get_project_base_directory(), "web/src/assets/yay.jpg"), "rb") as f:
+                m, tc = mdl.describe(f.read())
+                if not m and not tc:
+                    raise Exception(m)
+        except Exception as e:
+            msg += f"\nFail to access model({mdl_nm})." + str(e)
+    elif llm["model_type"] == LLMType.TTS.value:
+        mdl = TTSModel[factory](
+            key=llm["api_key"], model_name=mdl_nm, base_url=llm["api_base"]
+        )
+        try:
+            for resp in mdl.tts("Hello~ Ragflower!"):
+                pass
+        except RuntimeError as e:
+            msg += f"\nFail to access model({mdl_nm})." + str(e)
+    else:
+        # TODO: check other types of models
+        pass
+
+    if msg:
+        return get_data_error_result(message=msg)
+
+    if not TenantLLMService.filter_update(
+            [TenantLLM.tenant_id == current_user.id, TenantLLM.llm_factory == factory,
+             TenantLLM.llm_name == llm["llm_name"]], llm):
+        TenantLLMService.save(**llm)
+
+    return get_json_result(data=True)
+
+
+@manager.route('/delete_llm', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("llm_factory", "llm_name")
+def delete_llm():
+    req = request.json
+    TenantLLMService.filter_delete(
+        [TenantLLM.tenant_id == current_user.id, TenantLLM.llm_factory == req["llm_factory"],
+         TenantLLM.llm_name == req["llm_name"]])
+    return get_json_result(data=True)
+
+
+@manager.route('/delete_factory', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("llm_factory")
+def delete_factory():
+    req = request.json
+    TenantLLMService.filter_delete(
+        [TenantLLM.tenant_id == current_user.id, TenantLLM.llm_factory == req["llm_factory"]])
+    return get_json_result(data=True)
+
+
+@manager.route('/my_llms', methods=['GET'])  # noqa: F821
+@login_required
+def my_llms():
+    try:
+        res = {}
+        for o in TenantLLMService.get_my_llms(current_user.id):
+            if o["llm_factory"] not in res:
+                res[o["llm_factory"]] = {
+                    "tags": o["tags"],
+                    "llm": []
+                }
+            res[o["llm_factory"]]["llm"].append({
+                "type": o["model_type"],
+                "name": o["llm_name"],
+                "used_token": o["used_tokens"]
+            })
+        return get_json_result(data=res)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/list', methods=['GET'])  # noqa: F821
+@login_required
+def list_app():
+    self_deployed = ["Youdao", "FastEmbed", "BAAI", "Ollama", "Xinference", "LocalAI", "LM-Studio", "GPUStack"]
+    weighted = ["Youdao", "FastEmbed", "BAAI"] if settings.LIGHTEN != 0 else []
+    model_type = request.args.get("model_type")
+    try:
+        objs = TenantLLMService.query(tenant_id=current_user.id)
+        facts = {o.to_dict()["llm_factory"] for o in objs if o.api_key}
+        llms = LLMService.get_all()
+        llms = [m.to_dict()
+                for m in llms if m.status == StatusEnum.VALID.value and m.fid not in weighted]
+        for m in llms:
+            m["available"] = m["fid"] in facts or m["llm_name"].lower() == "flag-embedding" or m["fid"] in self_deployed
+
+        llm_set = {m["llm_name"] + "@" + m["fid"] for m in llms}
+        for o in objs:
+            if not o.api_key:
+                continue
+            if o.llm_name + "@" + o.llm_factory in llm_set:
+                continue
+            llms.append({"llm_name": o.llm_name, "model_type": o.model_type, "fid": o.llm_factory, "available": True})
+
+        res = {}
+        for m in llms:
+            if model_type and m["model_type"].find(model_type) < 0:
+                continue
+            if m["fid"] not in res:
+                res[m["fid"]] = []
+            res[m["fid"]].append(m)
+
+        return get_json_result(data=res)
+    except Exception as e:
+        return server_error_response(e)
--- a/api/apps/sdk/agent.py
+++ b/api/apps/sdk/agent.py
@ -0,0 +1,39 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from api.db.services.canvas_service import UserCanvasService
+from api.utils.api_utils import get_error_data_result, token_required
+from api.utils.api_utils import get_result
+from flask import request
+
+@manager.route('/agents', methods=['GET'])  # noqa: F821
+@token_required
+def list_agents(tenant_id):
+    id = request.args.get("id")
+    title = request.args.get("title")
+    if id or title:
+        canvas = UserCanvasService.query(id=id, title=title, user_id=tenant_id)
+        if not canvas:
+            return get_error_data_result("The agent doesn't exist.")
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 30))
+    orderby = request.args.get("orderby", "update_time")
+    if request.args.get("desc") == "False" or request.args.get("desc") == "false":
+        desc = False
+    else:
+        desc = True
+    canvas = UserCanvasService.get_list(tenant_id, page_number, items_per_page, orderby, desc, id, title)
+    return get_result(data=canvas)
--- a/api/apps/sdk/chat.py
+++ b/api/apps/sdk/chat.py
@ -0,0 +1,330 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+
+from flask import request
+from api import settings
+from api.db import StatusEnum
+from api.db.services.dialog_service import DialogService
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import TenantLLMService
+from api.db.services.user_service import TenantService
+from api.utils import get_uuid
+from api.utils.api_utils import get_error_data_result, token_required
+from api.utils.api_utils import get_result
+
+
+@manager.route('/chats', methods=['POST'])  # noqa: F821
+@token_required
+def create(tenant_id):
+    req = request.json
+    ids = req.get("dataset_ids")
+    if not ids:
+        return get_error_data_result(message="`dataset_ids` is required")
+    for kb_id in ids:
+        kbs = KnowledgebaseService.accessible(kb_id=kb_id, user_id=tenant_id)
+        if not kbs:
+            return get_error_data_result(f"You don't own the dataset {kb_id}")
+        kbs = KnowledgebaseService.query(id=kb_id)
+        kb = kbs[0]
+        if kb.chunk_num == 0:
+            return get_error_data_result(f"The dataset {kb_id} doesn't have any parsed files")
+    kbs = KnowledgebaseService.get_by_ids(ids)
+    embd_ids = [TenantLLMService.split_model_name_and_factory(kb.embd_id)[0] for kb in kbs]  # remove vendor suffix for comparison
+    embd_count = list(set(embd_ids))
+    if len(embd_count) != 1:
+        return get_result(message='Datasets use different embedding models.',
+                          code=settings.RetCode.AUTHENTICATION_ERROR)
+    req["kb_ids"] = ids
+    # llm
+    llm = req.get("llm")
+    if llm:
+        if "model_name" in llm:
+            req["llm_id"] = llm.pop("model_name")
+            if not TenantLLMService.query(tenant_id=tenant_id, llm_name=req["llm_id"], model_type="chat"):
+                return get_error_data_result(f"`model_name` {req.get('llm_id')} doesn't exist")
+        req["llm_setting"] = req.pop("llm")
+    e, tenant = TenantService.get_by_id(tenant_id)
+    if not e:
+        return get_error_data_result(message="Tenant not found!")
+    # prompt
+    prompt = req.get("prompt")
+    key_mapping = {"parameters": "variables",
+                   "prologue": "opener",
+                   "quote": "show_quote",
+                   "system": "prompt",
+                   "rerank_id": "rerank_model",
+                   "vector_similarity_weight": "keywords_similarity_weight"}
+    key_list = ["similarity_threshold", "vector_similarity_weight", "top_n", "rerank_id", "top_k"]
+    if prompt:
+        for new_key, old_key in key_mapping.items():
+            if old_key in prompt:
+                prompt[new_key] = prompt.pop(old_key)
+        for key in key_list:
+            if key in prompt:
+                req[key] = prompt.pop(key)
+        req["prompt_config"] = req.pop("prompt")
+    # init
+    req["id"] = get_uuid()
+    req["description"] = req.get("description", "A helpful Assistant")
+    req["icon"] = req.get("avatar", "")
+    req["top_n"] = req.get("top_n", 6)
+    req["top_k"] = req.get("top_k", 1024)
+    req["rerank_id"] = req.get("rerank_id", "")
+    if req.get("rerank_id"):
+        value_rerank_model = ["BAAI/bge-reranker-v2-m3", "maidalun1020/bce-reranker-base_v1"]
+        if req["rerank_id"] not in value_rerank_model and not TenantLLMService.query(tenant_id=tenant_id,
+                                                                                     llm_name=req.get("rerank_id"),
+                                                                                     model_type="rerank"):
+            return get_error_data_result(f"`rerank_model` {req.get('rerank_id')} doesn't exist")
+    if not req.get("llm_id"):
+        req["llm_id"] = tenant.llm_id
+    if not req.get("name"):
+        return get_error_data_result(message="`name` is required.")
+    if DialogService.query(name=req["name"], tenant_id=tenant_id, status=StatusEnum.VALID.value):
+        return get_error_data_result(message="Duplicated chat name in creating chat.")
+    # tenant_id
+    if req.get("tenant_id"):
+        return get_error_data_result(message="`tenant_id` must not be provided.")
+    req["tenant_id"] = tenant_id
+    # fill in defaults for any prompt parameters that were not supplied
+    default_prompt = {
+        "system": """You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence "The answer you are looking for is not found in the knowledge base!" Answers need to consider chat history.
+      Here is the knowledge base:
+      {knowledge}
+      The above is the knowledge base.""",
+        "prologue": "Hi! I'm your assistant, what can I do for you?",
+        "parameters": [
+            {"key": "knowledge", "optional": False}
+        ],
+        "empty_response": "Sorry! No relevant content was found in the knowledge base!",
+        "quote": True,
+        "tts": False,
+        "refine_multiturn": True
+    }
+    key_list_2 = ["system", "prologue", "parameters", "empty_response", "quote", "tts", "refine_multiturn"]
+    if "prompt_config" not in req:
+        req['prompt_config'] = {}
+    for key in key_list_2:
+        temp = req['prompt_config'].get(key)
+        if (not temp and key == 'system') or (key not in req["prompt_config"]):
+            req['prompt_config'][key] = default_prompt[key]
+    for p in req['prompt_config']["parameters"]:
+        if p["optional"]:
+            continue
+        if req['prompt_config']["system"].find("{%s}" % p["key"]) < 0:
+            return get_error_data_result(
+                message="Parameter '{}' is not used".format(p["key"]))
+    # save
+    if not DialogService.save(**req):
+        return get_error_data_result(message="Failed to create the chat!")
+    # response
+    e, res = DialogService.get_by_id(req["id"])
+    if not e:
+        return get_error_data_result(message="Failed to create the chat!")
+    res = res.to_json()
+    renamed_dict = {}
+    for key, value in res["prompt_config"].items():
+        new_key = key_mapping.get(key, key)
+        renamed_dict[new_key] = value
+    res["prompt"] = renamed_dict
+    del res["prompt_config"]
+    new_dict = {"similarity_threshold": res["similarity_threshold"],
+                "keywords_similarity_weight": 1-res["vector_similarity_weight"],
+                "top_n": res["top_n"],
+                "rerank_model": res['rerank_id']}
+    res["prompt"].update(new_dict)
+    for key in key_list:
+        del res[key]
+    res["llm"] = res.pop("llm_setting")
+    res["llm"]["model_name"] = res.pop("llm_id")
+    del res["kb_ids"]
+    res["dataset_ids"] = req["dataset_ids"]
+    res["avatar"] = res.pop("icon")
+    return get_result(data=res)
+
+
+@manager.route('/chats/<chat_id>', methods=['PUT'])  # noqa: F821
+@token_required
+def update(tenant_id, chat_id):
+    if not DialogService.query(tenant_id=tenant_id, id=chat_id, status=StatusEnum.VALID.value):
+        return get_error_data_result(message='You do not own the chat')
+    req = request.json
+    ids = req.get("dataset_ids")
+    if "show_quotation" in req:
+        req["do_refer"] = req.pop("show_quotation")
+    if "dataset_ids" in req:
+        if not ids:
+            return get_error_data_result("`dataset_ids` can't be empty")
+        if ids:
+            for kb_id in ids:
+                kbs = KnowledgebaseService.accessible(kb_id=kb_id, user_id=tenant_id)
+                if not kbs:
+                    return get_error_data_result(f"You don't own the dataset {kb_id}")
+                kbs = KnowledgebaseService.query(id=kb_id)
+                kb = kbs[0]
+                if kb.chunk_num == 0:
+                    return get_error_data_result(f"The dataset {kb_id} has no parsed files")
+            kbs = KnowledgebaseService.get_by_ids(ids)
+            embd_ids = [TenantLLMService.split_model_name_and_factory(kb.embd_id)[0] for kb in kbs]  # remove vendor suffix for comparison
+            embd_count = list(set(embd_ids))
+            if len(embd_count) != 1:
+                return get_result(
+                    message='Datasets use different embedding models.',
+                    code=settings.RetCode.AUTHENTICATION_ERROR)
+            req["kb_ids"] = ids
+    llm = req.get("llm")
+    if llm:
+        if "model_name" in llm:
+            req["llm_id"] = llm.pop("model_name")
+            if not TenantLLMService.query(tenant_id=tenant_id, llm_name=req["llm_id"], model_type="chat"):
+                return get_error_data_result(f"`model_name` {req.get('llm_id')} doesn't exist")
+        req["llm_setting"] = req.pop("llm")
+    e, tenant = TenantService.get_by_id(tenant_id)
+    if not e:
+        return get_error_data_result(message="Tenant not found!")
+    # prompt
+    prompt = req.get("prompt")
+    key_mapping = {"parameters": "variables",
+                   "prologue": "opener",
+                   "quote": "show_quote",
+                   "system": "prompt",
+                   "rerank_id": "rerank_model",
+                   "vector_similarity_weight": "keywords_similarity_weight"}
+    key_list = ["similarity_threshold", "vector_similarity_weight", "top_n", "rerank_id", "top_k"]
+    if prompt:
+        for new_key, old_key in key_mapping.items():
+            if old_key in prompt:
+                prompt[new_key] = prompt.pop(old_key)
+        for key in key_list:
+            if key in prompt:
+                req[key] = prompt.pop(key)
+        req["prompt_config"] = req.pop("prompt")
+    e, res = DialogService.get_by_id(chat_id)
+    res = res.to_json()
+    if req.get("rerank_id"):
+        value_rerank_model = ["BAAI/bge-reranker-v2-m3", "maidalun1020/bce-reranker-base_v1"]
+        if req["rerank_id"] not in value_rerank_model and not TenantLLMService.query(tenant_id=tenant_id,
+                                                                                     llm_name=req.get("rerank_id"),
+                                                                                     model_type="rerank"):
+            return get_error_data_result(f"`rerank_model` {req.get('rerank_id')} doesn't exist")
+    if "name" in req:
+        if not req.get("name"):
+            return get_error_data_result(message="`name` cannot be empty.")
+        if req["name"].lower() != res["name"].lower() \
+                and len(
+            DialogService.query(name=req["name"], tenant_id=tenant_id, status=StatusEnum.VALID.value)) > 0:
+            return get_error_data_result(message="Duplicated chat name in updating chat.")
+    if "prompt_config" in req:
+        res["prompt_config"].update(req["prompt_config"])
+        for p in res["prompt_config"]["parameters"]:
+            if p["optional"]:
+                continue
+            if res["prompt_config"]["system"].find("{%s}" % p["key"]) < 0:
+                return get_error_data_result(message="Parameter '{}' is not used".format(p["key"]))
+    if "llm_setting" in req:
+        res["llm_setting"].update(req["llm_setting"])
+    req["prompt_config"] = res["prompt_config"]
+    req["llm_setting"] = res["llm_setting"]
+    # avatar
+    if "avatar" in req:
+        req["icon"] = req.pop("avatar")
+    if "dataset_ids" in req:
+        req.pop("dataset_ids")
+    if not DialogService.update_by_id(chat_id, req):
+        return get_error_data_result(message="Chat not found!")
+    return get_result()
+
+
+@manager.route('/chats', methods=['DELETE'])  # noqa: F821
+@token_required
+def delete(tenant_id):
+    req = request.json
+    if not req:
+        ids = None
+    else:
+        ids = req.get("ids")
+    if not ids:
+        id_list = []
+        dias = DialogService.query(tenant_id=tenant_id, status=StatusEnum.VALID.value)
+        for dia in dias:
+            id_list.append(dia.id)
+    else:
+        id_list = ids
+    for id in id_list:
+        if not DialogService.query(tenant_id=tenant_id, id=id, status=StatusEnum.VALID.value):
+            return get_error_data_result(message=f"You don't own the chat {id}")
+        temp_dict = {"status": StatusEnum.INVALID.value}
+        DialogService.update_by_id(id, temp_dict)
+    return get_result()
+
+
+@manager.route('/chats', methods=['GET'])  # noqa: F821
+@token_required
+def list_chat(tenant_id):
+    id = request.args.get("id")
+    name = request.args.get("name")
+    if id or name:
+        chat = DialogService.query(id=id, name=name, status=StatusEnum.VALID.value, tenant_id=tenant_id)
+        if not chat:
+            return get_error_data_result(message="The chat doesn't exist")
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 30))
+    orderby = request.args.get("orderby", "create_time")
+    if request.args.get("desc") == "False" or request.args.get("desc") == "false":
+        desc = False
+    else:
+        desc = True
+    chats = DialogService.get_list(tenant_id, page_number, items_per_page, orderby, desc, id, name)
+    if not chats:
+        return get_result(data=[])
+    list_assts = []
+    key_mapping = {"parameters": "variables",
+                   "prologue": "opener",
+                   "quote": "show_quote",
+                   "system": "prompt",
+                   "rerank_id": "rerank_model",
+                   "vector_similarity_weight": "keywords_similarity_weight",
+                   "do_refer": "show_quotation"}
+    key_list = ["similarity_threshold", "vector_similarity_weight", "top_n", "rerank_id"]
+    for res in chats:
+        renamed_dict = {}
+        for key, value in res["prompt_config"].items():
+            new_key = key_mapping.get(key, key)
+            renamed_dict[new_key] = value
+        res["prompt"] = renamed_dict
+        del res["prompt_config"]
+        new_dict = {"similarity_threshold": res["similarity_threshold"],
+                    "keywords_similarity_weight": 1-res["vector_similarity_weight"],
+                    "top_n": res["top_n"],
+                    "rerank_model": res['rerank_id']}
+        res["prompt"].update(new_dict)
+        for key in key_list:
+            del res[key]
+        res["llm"] = res.pop("llm_setting")
+        res["llm"]["model_name"] = res.pop("llm_id")
+        kb_list = []
+        for kb_id in res["kb_ids"]:
+            kb = KnowledgebaseService.query(id=kb_id)
+            if not kb:
+                logging.warning(f"The kb {kb_id} doesn't exist")
+                continue
+            kb_list.append(kb[0].to_json())
+        del res["kb_ids"]
+        res["datasets"] = kb_list
+        res["avatar"] = res.pop("icon")
+        list_assts.append(res)
+    return get_result(data=list_assts)
--- a/api/apps/sdk/dataset.py
+++ b/api/apps/sdk/dataset.py
@ -0,0 +1,535 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from flask import request
+from api.db import StatusEnum, FileSource
+from api.db.db_models import File
+from api.db.services.document_service import DocumentService
+from api.db.services.file2document_service import File2DocumentService
+from api.db.services.file_service import FileService
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import TenantLLMService, LLMService
+from api.db.services.user_service import TenantService
+from api import settings
+from api.utils import get_uuid
+from api.utils.api_utils import (
+    get_result,
+    token_required,
+    get_error_data_result,
+    valid,
+    get_parser_config,
+)
+
+
+@manager.route("/datasets", methods=["POST"])  # noqa: F821
+@token_required
+def create(tenant_id):
+    """
+    Create a new dataset.
+    ---
+    tags:
+      - Datasets
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: header
+        name: Authorization
+        type: string
+        required: true
+        description: Bearer token for authentication.
+      - in: body
+        name: body
+        description: Dataset creation parameters.
+        required: true
+        schema:
+          type: object
+          required:
+            - name
+          properties:
+            name:
+              type: string
+              description: Name of the dataset.
+            permission:
+              type: string
+              enum: ['me', 'team']
+              description: Dataset permission.
+            language:
+              type: string
+              enum: ['Chinese', 'English']
+              description: Language of the dataset.
+            chunk_method:
+              type: string
+              enum: ["naive", "manual", "qa", "table", "paper", "book", "laws",
+                     "presentation", "picture", "one", "knowledge_graph", "email", "tag"
+                     ]
+              description: Chunking method.
+            parser_config:
+              type: object
+              description: Parser configuration.
+    responses:
+      200:
+        description: Successful operation.
+        schema:
+          type: object
+          properties:
+            data:
+              type: object
+    """
+    req = request.json
+    e, t = TenantService.get_by_id(tenant_id)
+    permission = req.get("permission")
+    language = req.get("language")
+    chunk_method = req.get("chunk_method")
+    parser_config = req.get("parser_config")
+    valid_permission = ["me", "team"]
+    valid_language = ["Chinese", "English"]
+    valid_chunk_method = [
+        "naive",
+        "manual",
+        "qa",
+        "table",
+        "paper",
+        "book",
+        "laws",
+        "presentation",
+        "picture",
+        "one",
+        "knowledge_graph",
+        "email",
+        "tag"
+    ]
+    check_validation = valid(
+        permission,
+        valid_permission,
+        language,
+        valid_language,
+        chunk_method,
+        valid_chunk_method,
+    )
+    if check_validation:
+        return check_validation
+    req["parser_config"] = get_parser_config(chunk_method, parser_config)
+    if "tenant_id" in req:
+        return get_error_data_result(message="`tenant_id` must not be provided")
+    if "chunk_count" in req or "document_count" in req:
+        return get_error_data_result(
+            message="`chunk_count` or `document_count` must not be provided"
+        )
+    if "name" not in req:
+        return get_error_data_result(message="`name` is required.")
+    req["id"] = get_uuid()
+    req["name"] = req["name"].strip()
+    if req["name"] == "":
+        return get_error_data_result(message="`name` cannot be an empty string.")
+    if KnowledgebaseService.query(
+        name=req["name"], tenant_id=tenant_id, status=StatusEnum.VALID.value
+    ):
+        return get_error_data_result(
+            message="Duplicated dataset name in creating dataset."
+        )
+    req["tenant_id"] = req["created_by"] = tenant_id
+    if not req.get("embedding_model"):
+        req["embedding_model"] = t.embd_id
+    else:
+        valid_embedding_models = [
+            "BAAI/bge-large-zh-v1.5",
+            "BAAI/bge-base-en-v1.5",
+            "BAAI/bge-large-en-v1.5",
+            "BAAI/bge-small-en-v1.5",
+            "BAAI/bge-small-zh-v1.5",
+            "jinaai/jina-embeddings-v2-base-en",
+            "jinaai/jina-embeddings-v2-small-en",
+            "nomic-ai/nomic-embed-text-v1.5",
+            "sentence-transformers/all-MiniLM-L6-v2",
+            "text-embedding-v2",
+            "text-embedding-v3",
+            "maidalun1020/bce-embedding-base_v1",
+        ]
+        embd_model = LLMService.query(
+            llm_name=req["embedding_model"], model_type="embedding"
+        )
+        if embd_model:
+            if req["embedding_model"] not in valid_embedding_models and not TenantLLMService.query(tenant_id=tenant_id, model_type="embedding", llm_name=req.get("embedding_model")):
+                return get_error_data_result(f"`embedding_model` {req.get('embedding_model')} doesn't exist")
+        if not embd_model:
+            embd_model = TenantLLMService.query(tenant_id=tenant_id, model_type="embedding", llm_name=req.get("embedding_model"))
+        if not embd_model:
+            return get_error_data_result(
+                f"`embedding_model` {req.get('embedding_model')} doesn't exist"
+            )
+    key_mapping = {
+        "chunk_num": "chunk_count",
+        "doc_num": "document_count",
+        "parser_id": "chunk_method",
+        "embd_id": "embedding_model",
+    }
+    mapped_keys = {
+        new_key: req[old_key]
+        for new_key, old_key in key_mapping.items()
+        if old_key in req
+    }
+    req.update(mapped_keys)
+    if not KnowledgebaseService.save(**req):
+        return get_error_data_result(message="Create dataset error.(Database error)")
+    renamed_data = {}
+    e, k = KnowledgebaseService.get_by_id(req["id"])
+    for key, value in k.to_dict().items():
+        new_key = key_mapping.get(key, key)
+        renamed_data[new_key] = value
+    return get_result(data=renamed_data)
+
+
+@manager.route("/datasets", methods=["DELETE"])  # noqa: F821
+@token_required
+def delete(tenant_id):
+    """
+    Delete datasets.
+    ---
+    tags:
+      - Datasets
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: header
+        name: Authorization
+        type: string
+        required: true
+        description: Bearer token for authentication.
+      - in: body
+        name: body
+        description: Dataset deletion parameters.
+        required: true
+        schema:
+          type: object
+          properties:
+            ids:
+              type: array
+              items:
+                type: string
+              description: List of dataset IDs to delete.
+    responses:
+      200:
+        description: Successful operation.
+        schema:
+          type: object
+    """
+    req = request.json
+    if not req:
+        ids = None
+    else:
+        ids = req.get("ids")
+    if not ids:
+        id_list = []
+        kbs = KnowledgebaseService.query(tenant_id=tenant_id)
+        for kb in kbs:
+            id_list.append(kb.id)
+    else:
+        id_list = ids
+    for id in id_list:
+        kbs = KnowledgebaseService.query(id=id, tenant_id=tenant_id)
+        if not kbs:
+            return get_error_data_result(message=f"You don't own the dataset {id}")
+        for doc in DocumentService.query(kb_id=id):
+            if not DocumentService.remove_document(doc, tenant_id):
+                return get_error_data_result(
+                    message="Remove document error.(Database error)"
+                )
+            f2d = File2DocumentService.get_by_document_id(doc.id)
+            FileService.filter_delete(
+                [
+                    File.source_type == FileSource.KNOWLEDGEBASE,
+                    File.id == f2d[0].file_id,
+                ]
+            )
+            File2DocumentService.delete_by_document_id(doc.id)
+        FileService.filter_delete(
+            [File.source_type == FileSource.KNOWLEDGEBASE, File.type == "folder", File.name == kbs[0].name])
+        if not KnowledgebaseService.delete_by_id(id):
+            return get_error_data_result(message="Delete dataset error.(Database error)")
+    return get_result(code=settings.RetCode.SUCCESS)
+
+
+@manager.route("/datasets/<dataset_id>", methods=["PUT"])  # noqa: F821
+@token_required
+def update(tenant_id, dataset_id):
+    """
+    Update a dataset.
+    ---
+    tags:
+      - Datasets
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: path
+        name: dataset_id
+        type: string
+        required: true
+        description: ID of the dataset to update.
+      - in: header
+        name: Authorization
+        type: string
+        required: true
+        description: Bearer token for authentication.
+      - in: body
+        name: body
+        description: Dataset update parameters.
+        required: true
+        schema:
+          type: object
+          properties:
+            name:
+              type: string
+              description: New name of the dataset.
+            permission:
+              type: string
+              enum: ['me', 'team']
+              description: Updated permission.
+            language:
+              type: string
+              enum: ['Chinese', 'English']
+              description: Updated language.
+            chunk_method:
+              type: string
+              enum: ["naive", "manual", "qa", "table", "paper", "book", "laws",
+                     "presentation", "picture", "one", "knowledge_graph", "email", "tag"
+                     ]
+              description: Updated chunking method.
+            parser_config:
+              type: object
+              description: Updated parser configuration.
+    responses:
+      200:
+        description: Successful operation.
+        schema:
+          type: object
+    """
+    if not KnowledgebaseService.query(id=dataset_id, tenant_id=tenant_id):
+        return get_error_data_result(message="You don't own the dataset")
+    req = request.json
+    e, t = TenantService.get_by_id(tenant_id)
+    invalid_keys = {"id", "embd_id", "chunk_num", "doc_num", "parser_id"}
+    if any(key in req for key in invalid_keys):
+        return get_error_data_result(message="The input parameters are invalid.")
+    permission = req.get("permission")
+    language = req.get("language")
+    chunk_method = req.get("chunk_method")
+    parser_config = req.get("parser_config")
+    valid_permission = ["me", "team"]
+    valid_language = ["Chinese", "English"]
+    valid_chunk_method = [
+        "naive",
+        "manual",
+        "qa",
+        "table",
+        "paper",
+        "book",
+        "laws",
+        "presentation",
+        "picture",
+        "one",
+        "knowledge_graph",
+        "email",
+        "tag"
+    ]
+    check_validation = valid(
+        permission,
+        valid_permission,
+        language,
+        valid_language,
+        chunk_method,
+        valid_chunk_method,
+    )
+    if check_validation:
+        return check_validation
+    if "tenant_id" in req:
+        if req["tenant_id"] != tenant_id:
+            return get_error_data_result(message="Can't change `tenant_id`.")
+    e, kb = KnowledgebaseService.get_by_id(dataset_id)
+    if "parser_config" in req:
+        temp_dict = kb.parser_config
+        temp_dict.update(req["parser_config"])
+        req["parser_config"] = temp_dict
+    if "chunk_count" in req:
+        if req["chunk_count"] != kb.chunk_num:
+            return get_error_data_result(message="Can't change `chunk_count`.")
+        req.pop("chunk_count")
+    if "document_count" in req:
+        if req["document_count"] != kb.doc_num:
+            return get_error_data_result(message="Can't change `document_count`.")
+        req.pop("document_count")
+    if "chunk_method" in req:
+        if kb.chunk_num != 0 and req["chunk_method"] != kb.parser_id:
+            return get_error_data_result(
+                message="If `chunk_count` is not 0, `chunk_method` is not changeable."
+            )
+        req["parser_id"] = req.pop("chunk_method")
+        if req["parser_id"] != kb.parser_id:
+            if not req.get("parser_config"):
+                req["parser_config"] = get_parser_config(chunk_method, parser_config)
+    if "embedding_model" in req:
+        if kb.chunk_num != 0 and req["embedding_model"] != kb.embd_id:
+            return get_error_data_result(
+                message="If `chunk_count` is not 0, `embedding_model` is not changeable."
+            )
+        if not req.get("embedding_model"):
+            return get_error_data_result("`embedding_model` can't be empty")
+        valid_embedding_models = [
+            "BAAI/bge-large-zh-v1.5",
+            "BAAI/bge-base-en-v1.5",
+            "BAAI/bge-large-en-v1.5",
+            "BAAI/bge-small-en-v1.5",
+            "BAAI/bge-small-zh-v1.5",
+            "jinaai/jina-embeddings-v2-base-en",
+            "jinaai/jina-embeddings-v2-small-en",
+            "nomic-ai/nomic-embed-text-v1.5",
+            "sentence-transformers/all-MiniLM-L6-v2",
+            "text-embedding-v2",
+            "text-embedding-v3",
+            "maidalun1020/bce-embedding-base_v1",
+        ]
+        embd_model = LLMService.query(
+            llm_name=req["embedding_model"], model_type="embedding"
+        )
+        if embd_model:
+            if req["embedding_model"] not in valid_embedding_models and not TenantLLMService.query(tenant_id=tenant_id, model_type="embedding", llm_name=req.get("embedding_model")):
+                return get_error_data_result(f"`embedding_model` {req.get('embedding_model')} doesn't exist")
+        if not embd_model:
+            embd_model = TenantLLMService.query(tenant_id=tenant_id, model_type="embedding", llm_name=req.get("embedding_model"))
+
+        if not embd_model:
+            return get_error_data_result(
+                f"`embedding_model` {req.get('embedding_model')} doesn't exist"
+            )
+        req["embd_id"] = req.pop("embedding_model")
+    if "name" in req:
+        req["name"] = req["name"].strip()
+        if (
+            req["name"].lower() != kb.name.lower()
+            and len(
+                KnowledgebaseService.query(
+                    name=req["name"], tenant_id=tenant_id, status=StatusEnum.VALID.value
+                )
+            )
+            > 0
+        ):
+            return get_error_data_result(
+                message="Duplicated dataset name in updating dataset."
+            )
+    if not KnowledgebaseService.update_by_id(kb.id, req):
+        return get_error_data_result(message="Update dataset error.(Database error)")
+    return get_result(code=settings.RetCode.SUCCESS)
+
+
+@manager.route("/datasets", methods=["GET"])  # noqa: F821
+@token_required
+def list(tenant_id):
+    """
+    List datasets.
+    ---
+    tags:
+      - Datasets
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: query
+        name: id
+        type: string
+        required: false
+        description: Dataset ID to filter.
+      - in: query
+        name: name
+        type: string
+        required: false
+        description: Dataset name to filter.
+      - in: query
+        name: page
+        type: integer
+        required: false
+        default: 1
+        description: Page number.
+      - in: query
+        name: page_size
+        type: integer
+        required: false
+        default: 30
+        description: Number of items per page.
+      - in: query
+        name: orderby
+        type: string
+        required: false
+        default: "create_time"
+        description: Field to order by.
+      - in: query
+        name: desc
+        type: boolean
+        required: false
+        default: true
+        description: Whether to sort in descending order.
+      - in: header
+        name: Authorization
+        type: string
+        required: true
+        description: Bearer token for authentication.
+    responses:
+      200:
+        description: Successful operation.
+        schema:
+          type: array
+          items:
+            type: object
+    """
+    id = request.args.get("id")
+    name = request.args.get("name")
+    if id:
+        kbs = KnowledgebaseService.get_kb_by_id(id, tenant_id)
+        if not kbs:
+            return get_error_data_result(f"You don't own the dataset {id}")
+    if name:
+        kbs = KnowledgebaseService.get_kb_by_name(name, tenant_id)
+        if not kbs:
+            return get_error_data_result(f"You don't own the dataset {name}")
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 30))
+    orderby = request.args.get("orderby", "create_time")
+    if request.args.get("desc") == "False" or request.args.get("desc") == "false":
+        desc = False
+    else:
+        desc = True
+    tenants = TenantService.get_joined_tenants_by_user_id(tenant_id)
+    kbs = KnowledgebaseService.get_list(
+        [m["tenant_id"] for m in tenants],
+        tenant_id,
+        page_number,
+        items_per_page,
+        orderby,
+        desc,
+        id,
+        name,
+    )
+    renamed_list = []
+    for kb in kbs:
+        key_mapping = {
+            "chunk_num": "chunk_count",
+            "doc_num": "document_count",
+            "parser_id": "chunk_method",
+            "embd_id": "embedding_model",
+        }
+        renamed_data = {}
+        for key, value in kb.items():
+            new_key = key_mapping.get(key, key)
+            renamed_data[new_key] = value
+        renamed_list.append(renamed_data)
+    return get_result(data=renamed_list)
--- a/api/apps/sdk/dify_retrieval.py
+++ b/api/apps/sdk/dify_retrieval.py
@ -0,0 +1,88 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from flask import request, jsonify
+
+from api.db import LLMType
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import LLMBundle
+from api import settings
+from api.utils.api_utils import validate_request, build_error_result, apikey_required
+from rag.app.tag import label_question
+
+
+@manager.route('/dify/retrieval', methods=['POST'])  # noqa: F821
+@apikey_required
+@validate_request("knowledge_id", "query")
+def retrieval(tenant_id):
+    req = request.json
+    question = req["query"]
+    kb_id = req["knowledge_id"]
+    use_kg = req.get("use_kg", False)
+    retrieval_setting = req.get("retrieval_setting", {})
+    similarity_threshold = float(retrieval_setting.get("score_threshold", 0.0))
+    top = int(retrieval_setting.get("top_k", 1024))
+
+    try:
+
+        e, kb = KnowledgebaseService.get_by_id(kb_id)
+        if not e:
+            return build_error_result(message="Knowledgebase not found!", code=settings.RetCode.NOT_FOUND)
+
+        if kb.tenant_id != tenant_id:
+            return build_error_result(message="Knowledgebase not found!", code=settings.RetCode.NOT_FOUND)
+
+        embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING.value, llm_name=kb.embd_id)
+
+        ranks = settings.retrievaler.retrieval(
+            question,
+            embd_mdl,
+            kb.tenant_id,
+            [kb_id],
+            page=1,
+            page_size=top,
+            similarity_threshold=similarity_threshold,
+            vector_similarity_weight=0.3,
+            top=top,
+            rank_feature=label_question(question, [kb])
+        )
+
+        if use_kg:
+            ck = settings.kg_retrievaler.retrieval(question,
+                                                   [tenant_id],
+                                                   [kb_id],
+                                                   embd_mdl,
+                                                   LLMBundle(kb.tenant_id, LLMType.CHAT))
+            if ck["content_with_weight"]:
+                ranks["chunks"].insert(0, ck)
+
+        records = []
+        for c in ranks["chunks"]:
+            c.pop("vector", None)
+            records.append({
+                "content": c["content_with_weight"],
+                "score": c["similarity"],
+                "title": c["docnm_kwd"],
+                "metadata": {}
+            })
+
+        return jsonify({"records": records})
+    except Exception as e:
+        if "not_found" in str(e):
+            return build_error_result(
+                message='No chunk found! Check the chunk status please!',
+                code=settings.RetCode.NOT_FOUND
+            )
+        return build_error_result(message=str(e), code=settings.RetCode.SERVER_ERROR)
--- a/api/apps/sdk/doc.py
+++ b/api/apps/sdk/doc.py
--- a/api/apps/sdk/session.py
+++ b/api/apps/sdk/session.py
@ -0,0 +1,643 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import re
+import json
+import time
+
+from api.db import LLMType
+from api.db.services.conversation_service import ConversationService, iframe_completion
+from api.db.services.conversation_service import completion as rag_completion
+from api.db.services.canvas_service import completion as agent_completion
+from api.db.services.dialog_service import ask, chat
+from agent.canvas import Canvas
+from api.db import StatusEnum
+from api.db.db_models import APIToken
+from api.db.services.api_service import API4ConversationService
+from api.db.services.canvas_service import UserCanvasService
+from api.db.services.dialog_service import DialogService
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.utils import get_uuid
+from api.utils.api_utils import get_error_data_result, validate_request
+from api.utils.api_utils import get_result, token_required
+from api.db.services.llm_service import LLMBundle
+from api.db.services.file_service import FileService
+
+from flask import jsonify, request, Response
+
+@manager.route('/chats/<chat_id>/sessions', methods=['POST'])  # noqa: F821
+@token_required
+def create(tenant_id, chat_id):
+    req = request.json
+    req["dialog_id"] = chat_id
+    dia = DialogService.query(tenant_id=tenant_id, id=req["dialog_id"], status=StatusEnum.VALID.value)
+    if not dia:
+        return get_error_data_result(message="You do not own the assistant.")
+    conv = {
+        "id": get_uuid(),
+        "dialog_id": req["dialog_id"],
+        "name": req.get("name", "New session"),
+        "message": [{"role": "assistant", "content": dia[0].prompt_config.get("prologue")}],
+        "user_id": req.get("user_id", "")
+    }
+    if not conv.get("name"):
+        return get_error_data_result(message="`name` can not be empty.")
+    ConversationService.save(**conv)
+    e, conv = ConversationService.get_by_id(conv["id"])
+    if not e:
+        return get_error_data_result(message="Failed to create a session!")
+    conv = conv.to_dict()
+    conv['messages'] = conv.pop("message")
+    conv["chat_id"] = conv.pop("dialog_id")
+    del conv["reference"]
+    return get_result(data=conv)
+
+
+@manager.route('/agents/<agent_id>/sessions', methods=['POST'])  # noqa: F821
+@token_required
+def create_agent_session(tenant_id, agent_id):
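+    # Read the JSON body only when present; request.json can fail on
+    # non-JSON (e.g. multipart upload) requests, so fall back to form data.
+    req = request.json if request.is_json else request.form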
+    files = request.files
+    user_id = request.args.get('user_id', '')
+
+    e, cvs = UserCanvasService.get_by_id(agent_id)
+    if not e:
+        return get_error_data_result("Agent not found.")
+
+    if not UserCanvasService.query(user_id=tenant_id, id=agent_id):
+        return get_error_data_result("You cannot access the agent.")
+
+    if not isinstance(cvs.dsl, str):
+        cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
+
+    canvas = Canvas(cvs.dsl, tenant_id)
+    canvas.reset()
+    query = canvas.get_preset_param()
+    if query:
+        for ele in query:
+            if not ele["optional"]:
+                if ele["type"] == "file":
+                    if files is None or not files.get(ele["key"]):
+                        return get_error_data_result(f"`{ele['key']}` with type `{ele['type']}` is required")
+                    upload_file = files.get(ele["key"])
+                    file_content = FileService.parse_docs([upload_file], user_id)
+                    file_name = upload_file.filename
+                    ele["value"] = file_name + "\n" + file_content
+                else:
+                    if req is None or not req.get(ele["key"]):
+                        return get_error_data_result(f"`{ele['key']}` with type `{ele['type']}` is required")
+                    ele["value"] = req[ele["key"]]
+            else:
+                if ele["type"] == "file":
+                    if files is not None and files.get(ele["key"]):
+                        upload_file = files.get(ele["key"])
+                        file_content = FileService.parse_docs([upload_file], user_id)
+                        file_name = upload_file.filename
+                        ele["value"] = file_name + "\n" + file_content
+                    else:
+                        if "value" in ele:
+                            ele.pop("value")
+                else:
+                    if req is not None and req.get(ele["key"]):
+                        ele["value"] = req[ele['key']]
+                    else:
+                        if "value" in ele:
+                            ele.pop("value")
+    else:
+        for ans in canvas.run(stream=False):
+            pass
+    cvs.dsl = json.loads(str(canvas))
+    conv = {
+        "id": get_uuid(),
+        "dialog_id": cvs.id,
+        "user_id": user_id,
+        "message": [{"role": "assistant", "content": canvas.get_prologue()}],
+        "source": "agent",
+        "dsl": cvs.dsl
+    }
+    API4ConversationService.save(**conv)
+    conv["agent_id"] = conv.pop("dialog_id")
+    return get_result(data=conv)
+
+
+@manager.route('/chats/<chat_id>/sessions/<session_id>', methods=['PUT'])  # noqa: F821
+@token_required
+def update(tenant_id, chat_id, session_id):
+    req = request.json
+    req["dialog_id"] = chat_id
+    conv_id = session_id
+    conv = ConversationService.query(id=conv_id, dialog_id=chat_id)
+    if not conv:
+        return get_error_data_result(message="Session does not exist")
+    if not DialogService.query(id=chat_id, tenant_id=tenant_id, status=StatusEnum.VALID.value):
+        return get_error_data_result(message="You do not own the session")
+    if "message" in req or "messages" in req:
+        return get_error_data_result(message="`message` cannot be changed.")
+    if "reference" in req:
+        return get_error_data_result(message="`reference` cannot be changed.")
+    if "name" in req and not req.get("name"):
+        return get_error_data_result(message="`name` can not be empty.")
+    if not ConversationService.update_by_id(conv_id, req):
+        return get_error_data_result(message="Failed to update the session.")
+    return get_result()
+
+
+@manager.route('/chats/<chat_id>/completions', methods=['POST'])  # noqa: F821
+@token_required
+def chat_completion(tenant_id, chat_id):
+    req = request.json
+    if not req:
+        req = {"question": ""}
+    if not req.get("session_id"):
+        req["question"] = ""
+    if not DialogService.query(tenant_id=tenant_id, id=chat_id, status=StatusEnum.VALID.value):
+        return get_error_data_result(f"You don't own the chat {chat_id}")
+    if req.get("session_id"):
+        if not ConversationService.query(id=req["session_id"], dialog_id=chat_id):
+            return get_error_data_result(f"You don't own the session {req['session_id']}")
+    if req.get("stream", True):
+        resp = Response(rag_completion(tenant_id, chat_id, **req), mimetype="text/event-stream")
+        resp.headers.add_header("Cache-control", "no-cache")
+        resp.headers.add_header("Connection", "keep-alive")
+        resp.headers.add_header("X-Accel-Buffering", "no")
+        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+
+        return resp
+    else:
+        answer = None
+        for ans in rag_completion(tenant_id, chat_id, **req):
+            answer = ans
+            break
+        return get_result(data=answer)
+
+
+@manager.route('/chats_openai/<chat_id>/chat/completions', methods=['POST'])  # noqa: F821
+@validate_request("model", "messages")  # noqa: F821
+@token_required
+def chat_completion_openai_like(tenant_id, chat_id):
+    """
+    OpenAI-like chat completion API that simulates the behavior of OpenAI's completions endpoint.
+    
+    This function allows users to interact with a model and receive responses based on a series of historical messages.
+    If `stream` is True (the default), the response is streamed in chunks, mimicking the OpenAI-style API.
+    If `stream` is explicitly set to False, the response is returned as a single complete answer.
+    Example usage:
+
+    curl -X POST https://ragflow_address.com/api/v1/chats_openai/<chat_id>/chat/completions \
+        -H "Content-Type: application/json" \
+        -H "Authorization: Bearer $RAGFLOW_API_KEY" \
+        -d '{
+            "model": "model",
+            "messages": [{"role": "user", "content": "Say this is a test!"}],
+            "stream": true
+        }'
+
+    Alternatively, you can use Python's `OpenAI` client:
+
+    from openai import OpenAI
+
+    model = "model"
+    client = OpenAI(api_key="ragflow-api-key", base_url=f"http://ragflow_address/api/v1/chats_openai/<chat_id>")
+    
+    completion = client.chat.completions.create(
+        model=model,
+        messages=[
+            {"role": "system", "content": "You are a helpful assistant."},
+            {"role": "user", "content": "Who are you?"},
+            {"role": "assistant", "content": "I am an AI assistant named..."},
+            {"role": "user", "content": "Can you tell me how to install neovim"},
+        ],
+        stream=True
+    )
+    
+    stream = True
+    if stream:
+        for chunk in completion:
+            print(chunk)
+    else:
+        print(completion.choices[0].message.content)
+    """
+    req = request.json
+
+    messages = req.get("messages", [])
+    # To prevent empty [] input
+    if len(messages) < 1:
+        return get_error_data_result("You have to provide messages.")
+    if messages[-1]["role"] != "user":
+        return get_error_data_result("The last message of this conversation is not from the user.")
+
+    prompt = messages[-1]["content"]
+    # Treat context tokens as reasoning tokens
+    context_token_used = sum(len(message["content"]) for message in messages)
+
+    dia = DialogService.query(tenant_id=tenant_id, id=chat_id, status=StatusEnum.VALID.value)
+    if not dia:
+        return get_error_data_result(f"You don't own the chat {chat_id}")
+    dia = dia[0]
+
+    # Drop system messages and any messages that precede the first user message
+    first_user = next((i for i, m in enumerate(messages) if m["role"] == "user"), len(messages))
+    msg = [m for m in messages[first_user:] if m["role"] != "system"]
+
+    if req.get("stream", True):
+        # The value for the usage field on all chunks except for the last one will be null.
+        # The usage field on the last chunk contains token usage statistics for the entire request.
+        # The choices field on the last chunk will always be an empty array [].
+        def streamed_response_generator(chat_id, dia, msg):
+            token_used = 0
+            response = {
+                "id": f"chatcmpl-{chat_id}",
+                "choices": [
+                    {
+                        "delta": {
+                            "content": "",
+                            "role": "assistant",
+                            "function_call": None,
+                            "tool_calls": None
+                        },
+                        "finish_reason": None,
+                        "index": 0,
+                        "logprobs": None
+                    }
+                ],
+                "created": int(time.time()),
+                "model": "model",
+                "object": "chat.completion.chunk",
+                "system_fingerprint": "",
+                "usage": None
+            }
+
+            try:
+                for ans in chat(dia, msg, True):
+                    answer = ans["answer"]
+                    incremental = answer[token_used:]
+                    token_used += len(incremental)
+                    response["choices"][0]["delta"]["content"] = incremental
+                    yield f"data:{json.dumps(response, ensure_ascii=False)}\n\n"
+            except Exception as e:
+                response["choices"][0]["delta"]["content"] = "**ERROR**: " + str(e)
+                yield f"data:{json.dumps(response, ensure_ascii=False)}\n\n"
+
+            # The last chunk
+            response["choices"][0]["delta"]["content"] = None
+            response["choices"][0]["finish_reason"] = "stop"
+            response["usage"] = {
+                "prompt_tokens": len(prompt),
+                "completion_tokens": token_used,
+                "total_tokens": len(prompt) + token_used
+            }
+            yield f"data:{json.dumps(response, ensure_ascii=False)}\n\n"
+            yield "data:[DONE]\n\n"
+
+
+        resp = Response(streamed_response_generator(chat_id, dia, msg), mimetype="text/event-stream")
+        resp.headers.add_header("Cache-control", "no-cache")
+        resp.headers.add_header("Connection", "keep-alive")
+        resp.headers.add_header("X-Accel-Buffering", "no")
+        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+        return resp
+    else:
+        answer = None
+        for ans in chat(dia, msg, False):
+            # focus answer content only
+            answer = ans
+            break
+        content = answer["answer"]
+
+        response = {
+            "id": f"chatcmpl-{chat_id}",
+            "object": "chat.completion",
+            "created": int(time.time()),
+            "model": req.get("model", ""),
+            "usage": {
+                "prompt_tokens": len(prompt),
+                "completion_tokens": len(content),
+                "total_tokens": len(prompt) + len(content),
+                "completion_tokens_details": {
+                    "reasoning_tokens": context_token_used,
+                    "accepted_prediction_tokens": len(content),
+                    "rejected_prediction_tokens": 0 # 0 for simplicity
+                }
+            },
+            "choices": [
+                {
+                    "message": {
+                        "role": "assistant",
+                        "content": content
+                    },
+                    "logprobs": None,
+                    "finish_reason": "stop",
+                    "index": 0
+                }
+            ]
+        }
+        return jsonify(response)
+
+
+@manager.route('/agents/<agent_id>/completions', methods=['POST'])  # noqa: F821
+@token_required
+def agent_completions(tenant_id, agent_id):
+    req = request.json
+    cvs = UserCanvasService.query(user_id=tenant_id, id=agent_id)
+    if not cvs:
+        return get_error_data_result(f"You don't own the agent {agent_id}")
+    if req.get("session_id"):
+        dsl = cvs[0].dsl
+        if not isinstance(dsl, str):
+            dsl = json.dumps(dsl)
+        #canvas = Canvas(dsl, tenant_id)
+        #if canvas.get_preset_param():
+        #    req["question"] = ""
+        conv = API4ConversationService.query(id=req["session_id"], dialog_id=agent_id)
+        if not conv:
+            return get_error_data_result(f"You don't own the session {req['session_id']}")
+    else:
+        req["question"] = ""
+    if req.get("stream", True):
+        resp = Response(agent_completion(tenant_id, agent_id, **req), mimetype="text/event-stream")
+        resp.headers.add_header("Cache-control", "no-cache")
+        resp.headers.add_header("Connection", "keep-alive")
+        resp.headers.add_header("X-Accel-Buffering", "no")
+        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+        return resp
+    try:
+        for answer in agent_completion(tenant_id, agent_id, **req):
+            return get_result(data=answer)
+    except Exception as e:
+        return get_error_data_result(str(e))
+
+
+@manager.route('/chats/<chat_id>/sessions', methods=['GET'])  # noqa: F821
+@token_required
+def list_session(tenant_id, chat_id):
+    if not DialogService.query(tenant_id=tenant_id, id=chat_id, status=StatusEnum.VALID.value):
+        return get_error_data_result(message=f"You don't own the assistant {chat_id}.")
+    id = request.args.get("id")
+    name = request.args.get("name")
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 30))
+    orderby = request.args.get("orderby", "create_time")
+    user_id = request.args.get("user_id")
+    if request.args.get("desc") == "False" or request.args.get("desc") == "false":
+        desc = False
+    else:
+        desc = True
+    convs = ConversationService.get_list(chat_id, page_number, items_per_page, orderby, desc, id, name, user_id)
+    if not convs:
+        return get_result(data=[])
+    for conv in convs:
+        conv['messages'] = conv.pop("message")
+        infos = conv["messages"]
+        for info in infos:
+            if "prompt" in info:
+                info.pop("prompt")
+        conv["chat_id"] = conv.pop("dialog_id")
+        if conv["reference"]:
+            messages = conv["messages"]
+            message_num = 0
+            chunk_num = 0
+            while message_num < len(messages):
+                if message_num != 0 and messages[message_num]["role"] != "user":
+                    chunk_list = []
+                    if "chunks" in conv["reference"][chunk_num]:
+                        chunks = conv["reference"][chunk_num]["chunks"]
+                        for chunk in chunks:
+                            new_chunk = {
+                                "id": chunk.get("chunk_id", chunk.get("id")),
+                                "content": chunk.get("content_with_weight", chunk.get("content")),
+                                "document_id": chunk.get("doc_id", chunk.get("document_id")),
+                                "document_name": chunk.get("docnm_kwd", chunk.get("document_name")),
+                                "dataset_id": chunk.get("kb_id", chunk.get("dataset_id")),
+                                "image_id": chunk.get("image_id", chunk.get("img_id")),
+                                "positions": chunk.get("positions", chunk.get("position_int")),
+                            }
+
+                            chunk_list.append(new_chunk)
+                    chunk_num += 1
+                    messages[message_num]["reference"] = chunk_list
+                message_num += 1
+        del conv["reference"]
+    return get_result(data=convs)
+
+
+@manager.route('/agents/<agent_id>/sessions', methods=['GET'])  # noqa: F821
+@token_required
+def list_agent_session(tenant_id, agent_id):
+    if not UserCanvasService.query(user_id=tenant_id, id=agent_id):
+        return get_error_data_result(message=f"You don't own the agent {agent_id}.")
+    id = request.args.get("id")
+    user_id = request.args.get("user_id")
+    page_number = int(request.args.get("page", 1))
+    items_per_page = int(request.args.get("page_size", 30))
+    orderby = request.args.get("orderby", "update_time")
+    if request.args.get("desc") == "False" or request.args.get("desc") == "false":
+        desc = False
+    else:
+        desc = True
+    convs = API4ConversationService.get_list(agent_id, tenant_id, page_number, items_per_page, orderby, desc, id, user_id)
+    if not convs:
+        return get_result(data=[])
+    for conv in convs:
+        conv['messages'] = conv.pop("message")
+        infos = conv["messages"]
+        for info in infos:
+            if "prompt" in info:
+                info.pop("prompt")
+        conv["agent_id"] = conv.pop("dialog_id")
+        if conv["reference"]:
+            messages = conv["messages"]
+            message_num = 0
+            chunk_num = 0
+            while message_num < len(messages):
+                if message_num != 0 and messages[message_num]["role"] != "user":
+                    chunk_list = []
+                    if "chunks" in conv["reference"][chunk_num]:
+                        chunks = conv["reference"][chunk_num]["chunks"]
+                        for chunk in chunks:
+                            new_chunk = {
+                                "id": chunk.get("chunk_id", chunk.get("id")),
+                                "content": chunk.get("content_with_weight", chunk.get("content")),
+                                "document_id": chunk.get("doc_id", chunk.get("document_id")),
+                                "document_name": chunk.get("docnm_kwd", chunk.get("document_name")),
+                                "dataset_id": chunk.get("kb_id", chunk.get("dataset_id")),
+                                "image_id": chunk.get("image_id", chunk.get("img_id")),
+                                "positions": chunk.get("positions", chunk.get("position_int")),
+                            }
+                            chunk_list.append(new_chunk)
+                    chunk_num += 1
+                    messages[message_num]["reference"] = chunk_list
+                message_num += 1
+        del conv["reference"]
+    return get_result(data=convs)
+
+
+@manager.route('/chats/<chat_id>/sessions', methods=["DELETE"])  # noqa: F821
+@token_required
+def delete(tenant_id, chat_id):
+    if not DialogService.query(id=chat_id, tenant_id=tenant_id, status=StatusEnum.VALID.value):
+        return get_error_data_result(message="You don't own the chat")
+    req = request.json
+    convs = ConversationService.query(dialog_id=chat_id)
+    if not req:
+        ids = None
+    else:
+        ids = req.get("ids")
+
+    if not ids:
+        conv_list = []
+        for conv in convs:
+            conv_list.append(conv.id)
+    else:
+        conv_list = ids
+    for id in conv_list:
+        conv = ConversationService.query(id=id, dialog_id=chat_id)
+        if not conv:
+            return get_error_data_result(message="The chat doesn't own the session")
+        ConversationService.delete_by_id(id)
+    return get_result()
+
+
+@manager.route('/sessions/ask', methods=['POST'])  # noqa: F821
+@token_required
+def ask_about(tenant_id):
+    req = request.json
+    if not req.get("question"):
+        return get_error_data_result("`question` is required.")
+    if not req.get("dataset_ids"):
+        return get_error_data_result("`dataset_ids` is required.")
+    if not isinstance(req.get("dataset_ids"), list):
+        return get_error_data_result("`dataset_ids` should be a list.")
+    req["kb_ids"] = req.pop("dataset_ids")
+    for kb_id in req["kb_ids"]:
+        if not KnowledgebaseService.accessible(kb_id, tenant_id):
+            return get_error_data_result(f"You don't own the dataset {kb_id}.")
+        kbs = KnowledgebaseService.query(id=kb_id)
+        kb = kbs[0]
+        if kb.chunk_num == 0:
+            return get_error_data_result(f"The dataset {kb_id} doesn't contain any parsed files.")
+    uid = tenant_id
+
+    def stream():
+        nonlocal req, uid
+        try:
+            for ans in ask(req["question"], req["kb_ids"], uid):
+                yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
+        except Exception as e:
+            yield "data:" + json.dumps({"code": 500, "message": str(e),
+                                        "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
+                                       ensure_ascii=False) + "\n\n"
+        yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
+
+    resp = Response(stream(), mimetype="text/event-stream")
+    resp.headers.add_header("Cache-control", "no-cache")
+    resp.headers.add_header("Connection", "keep-alive")
+    resp.headers.add_header("X-Accel-Buffering", "no")
+    resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+    return resp
+
+
+@manager.route('/sessions/related_questions', methods=['POST'])  # noqa: F821
+@token_required
+def related_questions(tenant_id):
+    req = request.json
+    if not req.get("question"):
+        return get_error_data_result("`question` is required.")
+    question = req["question"]
+    chat_mdl = LLMBundle(tenant_id, LLMType.CHAT)
+    prompt = """
+Objective: To generate search terms related to the user's search keywords, helping users find more valuable information.
+Instructions:
+ - Based on the keywords provided by the user, generate 5-10 related search terms.
+ - Each search term should be directly or indirectly related to the keyword, guiding the user to find more valuable information.
+ - Use common, general terms as much as possible, avoiding obscure words or technical jargon.
+ - Keep the term length between 2-4 words, concise and clear.
+ - DO NOT translate, use the language of the original keywords.
+
+### Example:
+Keywords: Chinese football
+Related search terms:
+1. Current status of Chinese football
+2. Reform of Chinese football
+3. Youth training of Chinese football
+4. Chinese football in the Asian Cup
+5. Chinese football in the World Cup
+
+Reason:
+ - When searching, users often only use one or two keywords, making it difficult to fully express their information needs.
+ - Generating related search terms can help users dig deeper into relevant information and improve search efficiency. 
+ - At the same time, related terms can also help search engines better understand user needs and return more accurate search results.
+
+"""
+    ans = chat_mdl.chat(prompt, [{"role": "user", "content": f"""
+Keywords: {question}
+Related search terms:
+    """}], {"temperature": 0.9})
+    return get_result(data=[re.sub(r"^[0-9]+\. ", "", a) for a in ans.split("\n") if re.match(r"^[0-9]+\. ", a)])
+
+
+@manager.route('/chatbots/<dialog_id>/completions', methods=['POST'])  # noqa: F821
+def chatbot_completions(dialog_id):
+    req = request.json
+
+    token = request.headers.get('Authorization', '').split()
+    if len(token) != 2:
+        return get_error_data_result(message='Authorization is not valid!')
+    token = token[1]
+    objs = APIToken.query(beta=token)
+    if not objs:
+        return get_error_data_result(message='Authentication error: API key is invalid!')
+
+    if "quote" not in req:
+        req["quote"] = False
+
+    if req.get("stream", True):
+        resp = Response(iframe_completion(dialog_id, **req), mimetype="text/event-stream")
+        resp.headers.add_header("Cache-control", "no-cache")
+        resp.headers.add_header("Connection", "keep-alive")
+        resp.headers.add_header("X-Accel-Buffering", "no")
+        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+        return resp
+
+    for answer in iframe_completion(dialog_id, **req):
+        return get_result(data=answer)
+
+
+@manager.route('/agentbots/<agent_id>/completions', methods=['POST'])  # noqa: F821
+def agent_bot_completions(agent_id):
+    req = request.json
+
+    token = request.headers.get('Authorization', '').split()
+    if len(token) != 2:
+        return get_error_data_result(message='Authorization is not valid!')
+    token = token[1]
+    objs = APIToken.query(beta=token)
+    if not objs:
+        return get_error_data_result(message='Authentication error: API key is invalid!')
+
+    if "quote" not in req:
+        req["quote"] = False
+
+    if req.get("stream", True):
+        resp = Response(agent_completion(objs[0].tenant_id, agent_id, **req), mimetype="text/event-stream")
+        resp.headers.add_header("Cache-control", "no-cache")
+        resp.headers.add_header("Connection", "keep-alive")
+        resp.headers.add_header("X-Accel-Buffering", "no")
+        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
+        return resp
+
+    for answer in agent_completion(objs[0].tenant_id, agent_id, **req):
+        return get_result(data=answer)
--- a/api/apps/system_app.py
+++ b/api/apps/system_app.py
@ -0,0 +1,300 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+from datetime import datetime
+import json
+
+from flask_login import login_required, current_user
+
+from api.db.db_models import APIToken
+from api.db.services.api_service import APITokenService
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.user_service import UserTenantService
+from api import settings
+from api.utils import current_timestamp, datetime_format
+from api.utils.api_utils import (
+    get_json_result,
+    get_data_error_result,
+    server_error_response,
+    generate_confirmation_token,
+)
+from api.versions import get_ragflow_version
+from rag.utils.storage_factory import STORAGE_IMPL, STORAGE_IMPL_TYPE
+from timeit import default_timer as timer
+
+from rag.utils.redis_conn import REDIS_CONN
+
+
+@manager.route("/version", methods=["GET"])  # noqa: F821
+@login_required
+def version():
+    """
+    Get the current version of the application.
+    ---
+    tags:
+      - System
+    security:
+      - ApiKeyAuth: []
+    responses:
+      200:
+        description: Version retrieved successfully.
+        schema:
+          type: object
+          properties:
+            version:
+              type: string
+              description: Version number.
+    """
+    return get_json_result(data=get_ragflow_version())
+
+
+@manager.route("/status", methods=["GET"])  # noqa: F821
+@login_required
+def status():
+    """
+    Get the system status.
+    ---
+    tags:
+      - System
+    security:
+      - ApiKeyAuth: []
+    responses:
+      200:
+        description: System is operational.
+        schema:
+          type: object
+          properties:
+            doc_engine:
+              type: object
+              description: Document engine status (for example, Elasticsearch).
+            storage:
+              type: object
+              description: Storage status.
+            database:
+              type: object
+              description: Database status.
+      503:
+        description: Service unavailable.
+        schema:
+          type: object
+          properties:
+            error:
+              type: string
+              description: Error message.
+    """
+    res = {}
+    st = timer()
+    try:
+        res["doc_engine"] = settings.docStoreConn.health()
+        res["doc_engine"]["elapsed"] = "{:.1f}".format((timer() - st) * 1000.0)
+    except Exception as e:
+        res["doc_engine"] = {
+            "type": "unknown",
+            "status": "red",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+            "error": str(e),
+        }
+
+    st = timer()
+    try:
+        STORAGE_IMPL.health()
+        res["storage"] = {
+            "storage": STORAGE_IMPL_TYPE.lower(),
+            "status": "green",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+        }
+    except Exception as e:
+        res["storage"] = {
+            "storage": STORAGE_IMPL_TYPE.lower(),
+            "status": "red",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+            "error": str(e),
+        }
+
+    st = timer()
+    try:
+        KnowledgebaseService.get_by_id("x")
+        res["database"] = {
+            "database": settings.DATABASE_TYPE.lower(),
+            "status": "green",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+        }
+    except Exception as e:
+        res["database"] = {
+            "database": settings.DATABASE_TYPE.lower(),
+            "status": "red",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+            "error": str(e),
+        }
+
+    st = timer()
+    try:
+        if not REDIS_CONN.health():
+            raise Exception("Lost connection!")
+        res["redis"] = {
+            "status": "green",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+        }
+    except Exception as e:
+        res["redis"] = {
+            "status": "red",
+            "elapsed": "{:.1f}".format((timer() - st) * 1000.0),
+            "error": str(e),
+        }
+
+    task_executor_heartbeats = {}
+    try:
+        task_executors = REDIS_CONN.smembers("TASKEXE")
+        now = datetime.now().timestamp()
+        for task_executor_id in task_executors:
+            heartbeats = REDIS_CONN.zrangebyscore(task_executor_id, now - 60*30, now)
+            heartbeats = [json.loads(heartbeat) for heartbeat in heartbeats]
+            task_executor_heartbeats[task_executor_id] = heartbeats
+    except Exception:
+        logging.exception("get task executor heartbeats failed!")
+    res["task_executor_heartbeats"] = task_executor_heartbeats
+
+    return get_json_result(data=res)
+
+
+@manager.route("/new_token", methods=["POST"])  # noqa: F821
+@login_required
+def new_token():
+    """
+    Generate a new API token.
+    ---
+    tags:
+      - API Tokens
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: query
+        name: name
+        type: string
+        required: false
+        description: Name of the token.
+    responses:
+      200:
+        description: Token generated successfully.
+        schema:
+          type: object
+          properties:
+            token:
+              type: string
+              description: The generated API token.
+    """
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+
+        tenant_id = tenants[0].tenant_id
+        obj = {
+            "tenant_id": tenant_id,
+            "token": generate_confirmation_token(tenant_id),
+            "beta": generate_confirmation_token(generate_confirmation_token(tenant_id)).replace("ragflow-", "")[:32],
+            "create_time": current_timestamp(),
+            "create_date": datetime_format(datetime.now()),
+            "update_time": None,
+            "update_date": None,
+        }
+
+        if not APITokenService.save(**obj):
+            return get_data_error_result(message="Fail to create a new API token!")
+
+        return get_json_result(data=obj)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route("/token_list", methods=["GET"])  # noqa: F821
+@login_required
+def token_list():
+    """
+    List all API tokens for the current user.
+    ---
+    tags:
+      - API Tokens
+    security:
+      - ApiKeyAuth: []
+    responses:
+      200:
+        description: List of API tokens.
+        schema:
+          type: object
+          properties:
+            tokens:
+              type: array
+              items:
+                type: object
+                properties:
+                  token:
+                    type: string
+                    description: The API token.
+                  name:
+                    type: string
+                    description: Name of the token.
+                  create_time:
+                    type: string
+                    description: Token creation time.
+    """
+    try:
+        tenants = UserTenantService.query(user_id=current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+
+        tenant_id = tenants[0].tenant_id
+        objs = APITokenService.query(tenant_id=tenant_id)
+        objs = [o.to_dict() for o in objs]
+        for o in objs:
+            if not o["beta"]:
+                o["beta"] = generate_confirmation_token(generate_confirmation_token(tenants[0].tenant_id)).replace("ragflow-", "")[:32]
+                APITokenService.filter_update([APIToken.tenant_id == tenant_id, APIToken.token == o["token"]], o)
+        return get_json_result(data=objs)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route("/token/<token>", methods=["DELETE"])  # noqa: F821
+@login_required
+def rm(token):
+    """
+    Remove an API token.
+    ---
+    tags:
+      - API Tokens
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: path
+        name: token
+        type: string
+        required: true
+        description: The API token to remove.
+    responses:
+      200:
+        description: Token removed successfully.
+        schema:
+          type: object
+          properties:
+            success:
+              type: boolean
+              description: Deletion status.
+    """
+    APITokenService.filter_delete(
+        [APIToken.tenant_id == current_user.id, APIToken.token == token]
+    )
+    return get_json_result(data=True)
--- a/api/apps/tenant_app.py
+++ b/api/apps/tenant_app.py
@ -0,0 +1,122 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from flask import request
+from flask_login import login_required, current_user
+
+from api import settings
+from api.db import UserTenantRole, StatusEnum
+from api.db.db_models import UserTenant
+from api.db.services.user_service import UserTenantService, UserService
+
+from api.utils import get_uuid, delta_seconds
+from api.utils.api_utils import get_json_result, validate_request, server_error_response, get_data_error_result
+
+
+@manager.route("/<tenant_id>/user/list", methods=["GET"])  # noqa: F821
+@login_required
+def user_list(tenant_id):
+    if current_user.id != tenant_id:
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    try:
+        users = UserTenantService.get_by_tenant_id(tenant_id)
+        for u in users:
+            u["delta_seconds"] = delta_seconds(str(u["update_date"]))
+        return get_json_result(data=users)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route('/<tenant_id>/user', methods=['POST'])  # noqa: F821
+@login_required
+@validate_request("email")
+def create(tenant_id):
+    if current_user.id != tenant_id:
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    req = request.json
+    invite_user_email = req["email"]
+    invite_users = UserService.query(email=invite_user_email)
+    if not invite_users:
+        return get_data_error_result(message="User not found.")
+
+    user_id_to_invite = invite_users[0].id
+    user_tenants = UserTenantService.query(user_id=user_id_to_invite, tenant_id=tenant_id)
+    if user_tenants:
+        user_tenant_role = user_tenants[0].role
+        if user_tenant_role == UserTenantRole.NORMAL:
+            return get_data_error_result(message=f"{invite_user_email} is already in the team.")
+        if user_tenant_role == UserTenantRole.OWNER:
+            return get_data_error_result(message=f"{invite_user_email} is the owner of the team.")
+        return get_data_error_result(message=f"{invite_user_email} is in the team, but the role: {user_tenant_role} is invalid.")
+
+    UserTenantService.save(
+        id=get_uuid(),
+        user_id=user_id_to_invite,
+        tenant_id=tenant_id,
+        invited_by=current_user.id,
+        role=UserTenantRole.INVITE,
+        status=StatusEnum.VALID.value)
+
+    usr = invite_users[0].to_dict()
+    usr = {k: v for k, v in usr.items() if k in ["id", "avatar", "email", "nickname"]}
+
+    return get_json_result(data=usr)
+
+
+@manager.route('/<tenant_id>/user/<user_id>', methods=['DELETE'])  # noqa: F821
+@login_required
+def rm(tenant_id, user_id):
+    if current_user.id != tenant_id and current_user.id != user_id:
+        return get_json_result(
+            data=False,
+            message='No authorization.',
+            code=settings.RetCode.AUTHENTICATION_ERROR)
+
+    try:
+        UserTenantService.filter_delete([UserTenant.tenant_id == tenant_id, UserTenant.user_id == user_id])
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route("/list", methods=["GET"])  # noqa: F821
+@login_required
+def tenant_list():
+    try:
+        users = UserTenantService.get_tenants_by_user_id(current_user.id)
+        for u in users:
+            u["delta_seconds"] = delta_seconds(str(u["update_date"]))
+        return get_json_result(data=users)
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route("/agree/<tenant_id>", methods=["PUT"])  # noqa: F821
+@login_required
+def agree(tenant_id):
+    try:
+        UserTenantService.filter_update([UserTenant.tenant_id == tenant_id, UserTenant.user_id == current_user.id], {"role": UserTenantRole.NORMAL})
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
--- a/api/apps/user_app.py
+++ b/api/apps/user_app.py
@ -0,0 +1,704 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+import json
+import re
+from datetime import datetime
+
+from flask import request, session, redirect
+from werkzeug.security import generate_password_hash, check_password_hash
+from flask_login import login_required, current_user, login_user, logout_user
+
+from api.db.db_models import TenantLLM
+from api.db.services.llm_service import TenantLLMService, LLMService
+from api.utils.api_utils import (
+    server_error_response,
+    validate_request,
+    get_data_error_result,
+)
+from api.utils import (
+    get_uuid,
+    get_format_time,
+    decrypt,
+    download_img,
+    current_timestamp,
+    datetime_format,
+)
+from api.db import UserTenantRole, FileType
+from api import settings
+from api.db.services.user_service import UserService, TenantService, UserTenantService
+from api.db.services.file_service import FileService
+from api.utils.api_utils import get_json_result, construct_response
+
+
+@manager.route("/login", methods=["POST", "GET"])  # noqa: F821
+def login():
+    """
+    User login endpoint.
+    ---
+    tags:
+      - User
+    parameters:
+      - in: body
+        name: body
+        description: Login credentials.
+        required: true
+        schema:
+          type: object
+          properties:
+            email:
+              type: string
+              description: User email.
+            password:
+              type: string
+              description: User password.
+    responses:
+      200:
+        description: Login successful.
+        schema:
+          type: object
+      401:
+        description: Authentication failed.
+        schema:
+          type: object
+    """
+    if not request.json:
+        return get_json_result(
+            data=False, code=settings.RetCode.AUTHENTICATION_ERROR, message="Unauthorized!"
+        )
+
+    email = request.json.get("email", "")
+    users = UserService.query(email=email)
+    if not users:
+        return get_json_result(
+            data=False,
+            code=settings.RetCode.AUTHENTICATION_ERROR,
+            message=f"Email: {email} is not registered!",
+        )
+
+    password = request.json.get("password")
+    try:
+        password = decrypt(password)
+    except Exception:
+        return get_json_result(
+            data=False, code=settings.RetCode.SERVER_ERROR, message="Fail to decrypt password"
+        )
+
+    user = UserService.query_user(email, password)
+    if user:
+        response_data = user.to_json()
+        user.access_token = get_uuid()
+        login_user(user)
+        user.update_time = current_timestamp()
+        user.update_date = datetime_format(datetime.now())
+        user.save()
+        msg = "Welcome back!"
+        return construct_response(data=response_data, auth=user.get_id(), message=msg)
+    else:
+        return get_json_result(
+            data=False,
+            code=settings.RetCode.AUTHENTICATION_ERROR,
+            message="Email and password do not match!",
+        )
+
+
+@manager.route("/github_callback", methods=["GET"])  # noqa: F821
+def github_callback():
+    """
+    GitHub OAuth callback endpoint.
+    ---
+    tags:
+      - OAuth
+    parameters:
+      - in: query
+        name: code
+        type: string
+        required: true
+        description: Authorization code from GitHub.
+    responses:
+      200:
+        description: Authentication successful.
+        schema:
+          type: object
+    """
+    import requests
+
+    res = requests.post(
+        settings.GITHUB_OAUTH.get("url"),
+        data={
+            "client_id": settings.GITHUB_OAUTH.get("client_id"),
+            "client_secret": settings.GITHUB_OAUTH.get("secret_key"),
+            "code": request.args.get("code"),
+        },
+        headers={"Accept": "application/json"},
+    )
+    res = res.json()
+    if "error" in res:
+        return redirect("/?error=%s" % res["error_description"])
+
+    if "user:email" not in res["scope"].split(","):
+        return redirect("/?error=user:email not in scope")
+
+    session["access_token"] = res["access_token"]
+    session["access_token_from"] = "github"
+    user_info = user_info_from_github(session["access_token"])
+    email_address = user_info["email"]
+    users = UserService.query(email=email_address)
+    user_id = get_uuid()
+    if not users:
+        # User isn't registered yet; try to register
+        try:
+            try:
+                avatar = download_img(user_info["avatar_url"])
+            except Exception as e:
+                logging.exception(e)
+                avatar = ""
+            users = user_register(
+                user_id,
+                {
+                    "access_token": session["access_token"],
+                    "email": email_address,
+                    "avatar": avatar,
+                    "nickname": user_info["login"],
+                    "login_channel": "github",
+                    "last_login_time": get_format_time(),
+                    "is_superuser": False,
+                },
+            )
+            if not users:
+                raise Exception(f"Fail to register {email_address}.")
+            if len(users) > 1:
+                raise Exception(f"Same email: {email_address} exists!")
+
+            # Try to log in
+            user = users[0]
+            login_user(user)
+            return redirect("/?auth=%s" % user.get_id())
+        except Exception as e:
+            rollback_user_registration(user_id)
+            logging.exception(e)
+            return redirect("/?error=%s" % str(e))
+
+    # User has already registered, try to log in
+    user = users[0]
+    user.access_token = get_uuid()
+    login_user(user)
+    user.save()
+    return redirect("/?auth=%s" % user.get_id())
+
+
+@manager.route("/feishu_callback", methods=["GET"])  # noqa: F821
+def feishu_callback():
+    """
+    Feishu OAuth callback endpoint.
+    ---
+    tags:
+      - OAuth
+    parameters:
+      - in: query
+        name: code
+        type: string
+        required: true
+        description: Authorization code from Feishu.
+    responses:
+      200:
+        description: Authentication successful.
+        schema:
+          type: object
+    """
+    import requests
+
+    app_access_token_res = requests.post(
+        settings.FEISHU_OAUTH.get("app_access_token_url"),
+        data=json.dumps(
+            {
+                "app_id": settings.FEISHU_OAUTH.get("app_id"),
+                "app_secret": settings.FEISHU_OAUTH.get("app_secret"),
+            }
+        ),
+        headers={"Content-Type": "application/json; charset=utf-8"},
+    )
+    app_access_token_res = app_access_token_res.json()
+    if app_access_token_res["code"] != 0:
+        return redirect("/?error=%s" % app_access_token_res)
+
+    res = requests.post(
+        settings.FEISHU_OAUTH.get("user_access_token_url"),
+        data=json.dumps(
+            {
+                "grant_type": settings.FEISHU_OAUTH.get("grant_type"),
+                "code": request.args.get("code"),
+            }
+        ),
+        headers={
+            "Content-Type": "application/json; charset=utf-8",
+            "Authorization": f"Bearer {app_access_token_res['app_access_token']}",
+        },
+    )
+    res = res.json()
+    if res["code"] != 0:
+        return redirect("/?error=%s" % res["message"])
+
+    if "contact:user.email:readonly" not in res["data"]["scope"].split():
+        return redirect("/?error=contact:user.email:readonly not in scope")
+    session["access_token"] = res["data"]["access_token"]
+    session["access_token_from"] = "feishu"
+    user_info = user_info_from_feishu(session["access_token"])
+    email_address = user_info["email"]
+    users = UserService.query(email=email_address)
+    user_id = get_uuid()
+    if not users:
+        # User isn't registered yet; try to register
+        try:
+            try:
+                avatar = download_img(user_info["avatar_url"])
+            except Exception as e:
+                logging.exception(e)
+                avatar = ""
+            users = user_register(
+                user_id,
+                {
+                    "access_token": session["access_token"],
+                    "email": email_address,
+                    "avatar": avatar,
+                    "nickname": user_info["en_name"],
+                    "login_channel": "feishu",
+                    "last_login_time": get_format_time(),
+                    "is_superuser": False,
+                },
+            )
+            if not users:
+                raise Exception(f"Fail to register {email_address}.")
+            if len(users) > 1:
+                raise Exception(f"Same email: {email_address} exists!")
+
+            # Try to log in
+            user = users[0]
+            login_user(user)
+            return redirect("/?auth=%s" % user.get_id())
+        except Exception as e:
+            rollback_user_registration(user_id)
+            logging.exception(e)
+            return redirect("/?error=%s" % str(e))
+
+    # User has already registered, try to log in
+    user = users[0]
+    user.access_token = get_uuid()
+    login_user(user)
+    user.save()
+    return redirect("/?auth=%s" % user.get_id())
+
+
+def user_info_from_feishu(access_token):
+    import requests
+
+    headers = {
+        "Content-Type": "application/json; charset=utf-8",
+        "Authorization": f"Bearer {access_token}",
+    }
+    res = requests.get(
+        "https://open.feishu.cn/open-apis/authen/v1/user_info", headers=headers
+    )
+    user_info = res.json()["data"]
+    user_info["email"] = user_info.get("email") or None
+    return user_info
+
+
+def user_info_from_github(access_token):
+    import requests
+
+    headers = {"Accept": "application/json", "Authorization": f"token {access_token}"}
+    res = requests.get(
+        "https://api.github.com/user", headers=headers
+    )
+    user_info = res.json()
+    email_info = requests.get(
+        "https://api.github.com/user/emails",
+        headers=headers,
+    ).json()
+    user_info["email"] = next(
+        (email["email"] for email in email_info if email["primary"]), None
+    )
+    return user_info
+
+
+@manager.route("/logout", methods=["GET"])  # noqa: F821
+@login_required
+def log_out():
+    """
+    User logout endpoint.
+    ---
+    tags:
+      - User
+    security:
+      - ApiKeyAuth: []
+    responses:
+      200:
+        description: Logout successful.
+        schema:
+          type: object
+    """
+    current_user.access_token = ""
+    current_user.save()
+    logout_user()
+    return get_json_result(data=True)
+
+
+@manager.route("/setting", methods=["POST"])  # noqa: F821
+@login_required
+def setting_user():
+    """
+    Update user settings.
+    ---
+    tags:
+      - User
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: body
+        name: body
+        description: User settings to update.
+        required: true
+        schema:
+          type: object
+          properties:
+            nickname:
+              type: string
+              description: New nickname.
+            email:
+              type: string
+              description: New email.
+    responses:
+      200:
+        description: Settings updated successfully.
+        schema:
+          type: object
+    """
+    update_dict = {}
+    request_data = request.json
+    if request_data.get("password"):
+        new_password = request_data.get("new_password")
+        if not check_password_hash(
+                current_user.password, decrypt(request_data["password"])
+        ):
+            return get_json_result(
+                data=False,
+                code=settings.RetCode.AUTHENTICATION_ERROR,
+                message="Password error!",
+            )
+
+        if new_password:
+            update_dict["password"] = generate_password_hash(decrypt(new_password))
+
+    for k in request_data.keys():
+        if k in [
+            "password",
+            "new_password",
+            "email",
+            "status",
+            "is_superuser",
+            "login_channel",
+            "is_anonymous",
+            "is_active",
+            "is_authenticated",
+            "last_login_time",
+        ]:
+            continue
+        update_dict[k] = request_data[k]
+
+    try:
+        UserService.update_by_id(current_user.id, update_dict)
+        return get_json_result(data=True)
+    except Exception as e:
+        logging.exception(e)
+        return get_json_result(
+            data=False, message="Update failure!", code=settings.RetCode.EXCEPTION_ERROR
+        )
+
+
+@manager.route("/info", methods=["GET"])  # noqa: F821
+@login_required
+def user_profile():
+    """
+    Get user profile information.
+    ---
+    tags:
+      - User
+    security:
+      - ApiKeyAuth: []
+    responses:
+      200:
+        description: User profile retrieved successfully.
+        schema:
+          type: object
+          properties:
+            id:
+              type: string
+              description: User ID.
+            nickname:
+              type: string
+              description: User nickname.
+            email:
+              type: string
+              description: User email.
+    """
+    return get_json_result(data=current_user.to_dict())
+
+
+def rollback_user_registration(user_id):
+    try:
+        UserService.delete_by_id(user_id)
+    except Exception:
+        pass
+    try:
+        TenantService.delete_by_id(user_id)
+    except Exception:
+        pass
+    try:
+        u = UserTenantService.query(tenant_id=user_id)
+        if u:
+            UserTenantService.delete_by_id(u[0].id)
+    except Exception:
+        pass
+    try:
+        TenantLLM.delete().where(TenantLLM.tenant_id == user_id).execute()
+    except Exception:
+        pass
+
+
+def user_register(user_id, user):
+    user["id"] = user_id
+    tenant = {
+        "id": user_id,
+        "name": user["nickname"] + "'s Kingdom",
+        "llm_id": settings.CHAT_MDL,
+        "embd_id": settings.EMBEDDING_MDL,
+        "asr_id": settings.ASR_MDL,
+        "parser_ids": settings.PARSERS,
+        "img2txt_id": settings.IMAGE2TEXT_MDL,
+        "rerank_id": settings.RERANK_MDL,
+    }
+    usr_tenant = {
+        "tenant_id": user_id,
+        "user_id": user_id,
+        "invited_by": user_id,
+        "role": UserTenantRole.OWNER,
+    }
+    file_id = get_uuid()
+    file = {
+        "id": file_id,
+        "parent_id": file_id,
+        "tenant_id": user_id,
+        "created_by": user_id,
+        "name": "/",
+        "type": FileType.FOLDER.value,
+        "size": 0,
+        "location": "",
+    }
+    tenant_llm = []
+    for llm in LLMService.query(fid=settings.LLM_FACTORY):
+        tenant_llm.append(
+            {
+                "tenant_id": user_id,
+                "llm_factory": settings.LLM_FACTORY,
+                "llm_name": llm.llm_name,
+                "model_type": llm.model_type,
+                "api_key": settings.API_KEY,
+                "api_base": settings.LLM_BASE_URL,
+                "max_tokens": llm.max_tokens if llm.max_tokens else 8192
+            }
+        )
+
+    if not UserService.save(**user):
+        return
+    TenantService.insert(**tenant)
+    UserTenantService.insert(**usr_tenant)
+    TenantLLMService.insert_many(tenant_llm)
+    FileService.insert(file)
+    return UserService.query(email=user["email"])
+
+
+@manager.route("/register", methods=["POST"])  # noqa: F821
+@validate_request("nickname", "email", "password")
+def user_add():
+    """
+    Register a new user.
+    ---
+    tags:
+      - User
+    parameters:
+      - in: body
+        name: body
+        description: Registration details.
+        required: true
+        schema:
+          type: object
+          properties:
+            nickname:
+              type: string
+              description: User nickname.
+            email:
+              type: string
+              description: User email.
+            password:
+              type: string
+              description: User password.
+    responses:
+      200:
+        description: Registration successful.
+        schema:
+          type: object
+    """
+    req = request.json
+    email_address = req["email"]
+
+    # Validate the email address
+    if not re.match(r"^[\w\._-]+@([\w_-]+\.)+[\w-]{2,}$", email_address):
+        return get_json_result(
+            data=False,
+            message=f"Invalid email address: {email_address}!",
+            code=settings.RetCode.OPERATING_ERROR,
+        )
+
+    # Check if the email address is already used
+    if UserService.query(email=email_address):
+        return get_json_result(
+            data=False,
+            message=f"Email: {email_address} has already been registered!",
+            code=settings.RetCode.OPERATING_ERROR,
+        )
+
+    # Construct user info data
+    nickname = req["nickname"]
+    user_dict = {
+        "access_token": get_uuid(),
+        "email": email_address,
+        "nickname": nickname,
+        "password": decrypt(req["password"]),
+        "login_channel": "password",
+        "last_login_time": get_format_time(),
+        "is_superuser": False,
+    }
+
+    user_id = get_uuid()
+    try:
+        users = user_register(user_id, user_dict)
+        if not users:
+            raise Exception(f"Fail to register {email_address}.")
+        if len(users) > 1:
+            raise Exception(f"Same email: {email_address} exists!")
+        user = users[0]
+        login_user(user)
+        return construct_response(
+            data=user.to_json(),
+            auth=user.get_id(),
+            message=f"{nickname}, welcome aboard!",
+        )
+    except Exception as e:
+        rollback_user_registration(user_id)
+        logging.exception(e)
+        return get_json_result(
+            data=False,
+            message=f"User registration failure, error: {str(e)}",
+            code=settings.RetCode.EXCEPTION_ERROR,
+        )
+
+
+@manager.route("/tenant_info", methods=["GET"])  # noqa: F821
+@login_required
+def tenant_info():
+    """
+    Get tenant information.
+    ---
+    tags:
+      - Tenant
+    security:
+      - ApiKeyAuth: []
+    responses:
+      200:
+        description: Tenant information retrieved successfully.
+        schema:
+          type: object
+          properties:
+            tenant_id:
+              type: string
+              description: Tenant ID.
+            name:
+              type: string
+              description: Tenant name.
+            llm_id:
+              type: string
+              description: LLM ID.
+            embd_id:
+              type: string
+              description: Embedding model ID.
+    """
+    try:
+        tenants = TenantService.get_info_by(current_user.id)
+        if not tenants:
+            return get_data_error_result(message="Tenant not found!")
+        return get_json_result(data=tenants[0])
+    except Exception as e:
+        return server_error_response(e)
+
+
+@manager.route("/set_tenant_info", methods=["POST"])  # noqa: F821
+@login_required
+@validate_request("tenant_id", "asr_id", "embd_id", "img2txt_id", "llm_id")
+def set_tenant_info():
+    """
+    Update tenant information.
+    ---
+    tags:
+      - Tenant
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: body
+        name: body
+        description: Tenant information to update.
+        required: true
+        schema:
+          type: object
+          properties:
+            tenant_id:
+              type: string
+              description: Tenant ID.
+            llm_id:
+              type: string
+              description: LLM ID.
+            embd_id:
+              type: string
+              description: Embedding model ID.
+            asr_id:
+              type: string
+              description: ASR model ID.
+            img2txt_id:
+              type: string
+              description: Image to Text model ID.
+    responses:
+      200:
+        description: Tenant information updated successfully.
+        schema:
+          type: object
+    """
+    req = request.json
+    try:
+        tid = req.pop("tenant_id")
+        TenantService.update_by_id(tid, req)
+        return get_json_result(data=True)
+    except Exception as e:
+        return server_error_response(e)
--- a/Show More
+++ b/Show More
@ -0,0 +1 @@
+from .deep_research import DeepResearcher as DeepResearcher