Create vector store file

post /vector_stores/{vector_store_id}/files

Create a vector store file by attaching a File to a vector store.

Path Parameters

  • vector_store_id: string

Body Parameters

  • file_id: string

    A File ID that the vector store should use. Useful for tools like file_search that can access files. For multi-file ingestion, we recommend file_batches to minimize per-vector-store write requests.

  • attributes: optional map[string or number or boolean]

    Up to 16 key-value pairs that can be attached to an object. This is useful for storing additional structured information about the object and for querying objects via the API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.

    • string

    • number

    • boolean

  • chunking_strategy: optional FileChunkingStrategyParam

    The strategy used to chunk the file. If not set, the auto strategy is used.

    • AutoFileChunkingStrategyParam object { type }

      The default strategy. This strategy currently uses a max_chunk_size_tokens of 800 and chunk_overlap_tokens of 400.

      • type: "auto"

        Always auto.

        • "auto"

    • StaticFileChunkingStrategyObjectParam object { static, type }

      Customize your own chunking strategy by setting chunk size and chunk overlap.

      • static: StaticFileChunkingStrategy

        • chunk_overlap_tokens: number

          The number of tokens that overlap between chunks. The default value is 400.

          Note that the overlap must not exceed half of max_chunk_size_tokens.

        • max_chunk_size_tokens: number

          The maximum number of tokens in each chunk. The default value is 800. The minimum value is 100 and the maximum value is 4096.

      • type: "static"

        Always static.

        • "static"
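The limits on attributes and chunking_strategy described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function names validate_attributes, static_chunking_strategy, and build_request_body are hypothetical, the 16-pair figure is treated as a maximum, and rejecting a negative overlap is an assumption the parameter descriptions do not state.

```python
from typing import Optional


def validate_attributes(attributes: dict) -> None:
    """Raise ValueError if the map breaks one of the documented limits."""
    if len(attributes) > 16:
        raise ValueError("at most 16 key-value pairs are allowed")
    for key, value in attributes.items():
        if not isinstance(key, str) or len(key) > 64:
            raise ValueError(f"key {key!r} must be a string of at most 64 characters")
        if isinstance(value, str):
            if len(value) > 512:
                raise ValueError(f"string value for {key!r} exceeds 512 characters")
        elif not isinstance(value, (bool, int, float)):
            raise ValueError(f"value for {key!r} must be a string, number, or boolean")


def static_chunking_strategy(max_chunk_size_tokens: int = 800,
                             chunk_overlap_tokens: int = 400) -> dict:
    """Build a static chunking_strategy object, defaulting to 800 and 400."""
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096")
    if chunk_overlap_tokens < 0:
        # Assumption: a negative overlap is treated as invalid.
        raise ValueError("chunk_overlap_tokens must not be negative")
    if chunk_overlap_tokens * 2 > max_chunk_size_tokens:
        raise ValueError("chunk_overlap_tokens must not exceed half of max_chunk_size_tokens")
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }


def build_request_body(file_id: str,
                       attributes: Optional[dict] = None,
                       chunking_strategy: Optional[dict] = None) -> dict:
    """Assemble the JSON body; optional fields are omitted when not given."""
    body = {"file_id": file_id}
    if attributes is not None:
        validate_attributes(attributes)
        body["attributes"] = attributes
    if chunking_strategy is not None:
        body["chunking_strategy"] = chunking_strategy
    return body
```

For example, build_request_body("file-abc123", chunking_strategy=static_chunking_strategy(1000, 500)) is accepted, while an overlap of 501 with the same chunk size raises ValueError because it exceeds half of max_chunk_size_tokens.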

Returns

  • VectorStoreFile object { id, created_at, last_error, 6 more }

    A file attached to a vector store.

    • id: string

      The identifier, which can be referenced in API endpoints.

    • created_at: number

      The Unix timestamp (in seconds) for when the vector store file was created.

    • last_error: object { code, message }

      The last error associated with this vector store file. Will be null if there are no errors.

      • code: "server_error" or "unsupported_file" or "invalid_file"

        One of server_error, unsupported_file, or invalid_file.

        • "server_error"

        • "unsupported_file"

        • "invalid_file"

      • message: string

        A human-readable description of the error.

    • object: "vector_store.file"

      The object type, which is always vector_store.file.

      • "vector_store.file"

    • status: "in_progress" or "completed" or "cancelled" or "failed"

      The status of the vector store file: one of in_progress, completed, cancelled, or failed. A status of completed indicates that the vector store file is ready for use.

      • "in_progress"

      • "completed"

      • "cancelled"

      • "failed"

    • usage_bytes: number

      The total vector store usage in bytes. Note that this may be different from the original file size.

    • vector_store_id: string

      The ID of the vector store that the File is attached to.

    • attributes: optional map[string or number or boolean]

      Up to 16 key-value pairs that can be attached to an object. This is useful for storing additional structured information about the object and for querying objects via the API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.

      • string

      • number

      • boolean

    • chunking_strategy: optional StaticFileChunkingStrategyObject or OtherFileChunkingStrategyObject

      The strategy used to chunk the file.

      • StaticFileChunkingStrategyObject object { static, type }

        • static: StaticFileChunkingStrategy

          • chunk_overlap_tokens: number

            The number of tokens that overlap between chunks. The default value is 400.

            Note that the overlap must not exceed half of max_chunk_size_tokens.

          • max_chunk_size_tokens: number

            The maximum number of tokens in each chunk. The default value is 800. The minimum value is 100 and the maximum value is 4096.

        • type: "static"

          Always static.

          • "static"

      • OtherFileChunkingStrategyObject object { type }

        This is returned when the chunking strategy is unknown. Typically, this is because the file was indexed before the chunking_strategy concept was introduced in the API.

        • type: "other"

          Always other.

          • "other"
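Because the returned object may still have a status of in_progress, a caller that needs the file to be ready usually has to re-read it until the status changes. The sketch below shows one way to do that. The name wait_until_done and the fetch callable are hypothetical: fetch stands for whatever the caller uses to obtain the current VectorStoreFile as a dict, which this section does not define. Treating cancelled and failed as final alongside completed is also an assumption.

```python
import time
from typing import Callable

# Assumption: these three statuses never change back to in_progress.
TERMINAL_STATUSES = {"completed", "cancelled", "failed"}


def wait_until_done(fetch: Callable[[], dict],
                    interval_s: float = 1.0,
                    max_attempts: int = 30) -> dict:
    """Call fetch() until the file leaves in_progress or attempts run out."""
    for attempt in range(max_attempts):
        vs_file = fetch()
        if vs_file["status"] in TERMINAL_STATUSES:
            return vs_file
        if attempt < max_attempts - 1:
            time.sleep(interval_s)  # pause before the next read
    raise TimeoutError(f"still in_progress after {max_attempts} attempts")
```

After the call returns, a status of failed can be inspected through last_error, which carries the code and a human-readable message.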

Example

curl https://api.openai.com/v1/vector_stores/$VECTOR_STORE_ID/files \
    -H 'Content-Type: application/json' \
    -H 'OpenAI-Beta: assistants=v2' \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{
          "file_id": "file_id"
        }'

Response

{
  "id": "id",
  "created_at": 0,
  "last_error": {
    "code": "server_error",
    "message": "message"
  },
  "object": "vector_store.file",
  "status": "in_progress",
  "usage_bytes": 0,
  "vector_store_id": "vector_store_id",
  "attributes": {
    "foo": "string"
  },
  "chunking_strategy": {
    "static": {
      "chunk_overlap_tokens": 0,
      "max_chunk_size_tokens": 100
    },
    "type": "static"
  }
}
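A response like the one above has two fields that need care when read: last_error may be null, and chunking_strategy is optional and may be of type static or other. The following sketch, with the hypothetical name summarize_vector_store_file, turns a parsed response into a one-line summary and handles each of those cases.

```python
def summarize_vector_store_file(vs_file: dict) -> str:
    """Return a one-line description of a parsed VectorStoreFile response."""
    parts = [f"{vs_file['id']}: {vs_file['status']}"]

    # last_error is null (None after JSON parsing) when there are no errors.
    error = vs_file.get("last_error")
    if error is not None:
        parts.append(f"error {error['code']}: {error['message']}")

    # chunking_strategy is optional, so it may be missing entirely.
    strategy = vs_file.get("chunking_strategy")
    if strategy is None:
        parts.append("chunking strategy not reported")
    elif strategy["type"] == "static":
        static = strategy["static"]
        parts.append(
            f"static chunks of up to {static['max_chunk_size_tokens']} tokens, "
            f"overlap {static['chunk_overlap_tokens']}"
        )
    else:
        # type "other": the strategy is unknown.
        parts.append("chunking strategy unknown")

    return "; ".join(parts)
```

Applied to the placeholder response above, this returns "id: in_progress; error server_error: message; static chunks of up to 100 tokens, overlap 0".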

Example

curl https://api.openai.com/v1/vector_stores/vs_abc123/files \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -H "OpenAI-Beta: assistants=v2" \
    -d '{
      "file_id": "file-abc123"
    }'

Response

{
  "id": "file-abc123",
  "object": "vector_store.file",
  "created_at": 1699061776,
  "usage_bytes": 1234,
  "vector_store_id": "vs_abc123",
  "status": "completed",
  "last_error": null
}
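The curl examples above can be reproduced with the Python standard library alone. This is a sketch rather than an official client: the function names build_create_request and create_vector_store_file are hypothetical, and the URL and headers are copied from the curl commands. Building the request is kept separate from sending it so the request can be inspected without network access.

```python
import json
import urllib.request
from urllib.parse import quote


def build_create_request(vector_store_id: str, file_id: str,
                         api_key: str) -> urllib.request.Request:
    """Build the POST request shown in the curl examples, without sending it."""
    url = ("https://api.openai.com/v1/vector_stores/"
           f"{quote(vector_store_id, safe='')}/files")
    body = json.dumps({"file_id": file_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "OpenAI-Beta": "assistants=v2",
        },
    )


def create_vector_store_file(vector_store_id: str, file_id: str,
                             api_key: str) -> dict:
    """Send the request and return the parsed VectorStoreFile object."""
    request = build_create_request(vector_store_id, file_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling create_vector_store_file("vs_abc123", "file-abc123", api_key) with a valid key corresponds to the second curl example. urlopen raises urllib.error.HTTPError on a non-success status, which callers may want to catch.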