Feature add tigergraph support #2315

Open
alexander-belikov wants to merge 10 commits into HKUDS:main from alexander-belikov:feature-add-tigergraph-support

alexander-belikov commented Nov 5, 2025


Related Issues

This PR adds TigerGraph as a new graph storage backend option for LightRAG, expanding the available graph database integrations alongside existing options (Neo4j, Memgraph, PostgreSQL, MongoDB, NetworkX).

Changes Made

1. New TigerGraph Storage Implementation (lightrag/kg/tigergraph_impl.py)

  • New file: Complete implementation of TigerGraphStorage class extending BaseGraphStorage

  • All abstract methods implemented:

    • Node operations: has_node, get_node, get_nodes_batch, upsert_node, delete_node, remove_nodes
    • Edge operations: has_edge, get_edge, get_edges_batch, upsert_edge, remove_edges
    • Graph traversal: get_node_edges, get_nodes_edges_batch, get_knowledge_graph
    • Query operations: node_degree, node_degrees_batch, edge_degree, edge_degrees_batch
    • Label operations: get_all_labels, get_popular_labels, search_labels
    • Chunk operations: get_nodes_by_chunk_ids, get_edges_by_chunk_ids
    • Bulk operations: get_all_nodes, get_all_edges
    • Lifecycle: initialize, finalize, index_done_callback, drop
  • Key features:

    • Uses pyTigerGraph Python driver with async wrappers (asyncio.to_thread)
    • URI-based connection pattern (similar to Neo4j): TIGERGRAPH_URI
    • Workspace isolation using workspace label as vertex type name
    • Automatic schema creation on initialization
    • Undirected edge support (the edge type is named DIRECTED for compatibility)
    • Chinese text support in label search
    • Retry logic with tenacity for write operations
    • Error handling and logging consistent with other backends
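
The async-wrapper and retry features listed above can be sketched roughly as follows. This is an illustration, not the PR's code: `FakeConnection` and `upsert_node` are hypothetical names, `FakeConnection` stands in for the blocking pyTigerGraph connection, and a hand-rolled backoff loop replaces the tenacity decorator so the snippet runs with the standard library only.

```python
import asyncio
import time


class FakeConnection:
    """Hypothetical stand-in for a blocking graph client (not pyTigerGraph)."""

    def __init__(self, failures_before_success: int = 0):
        self._failures_left = failures_before_success
        self.vertices = {}

    def upsert_vertex(self, vertex_type: str, vertex_id: str, attrs: dict) -> int:
        # Simulate a transient write error for the first few calls.
        if self._failures_left > 0:
            self._failures_left -= 1
            raise ConnectionError("transient write failure")
        time.sleep(0.01)  # pretend this is a blocking HTTP round trip
        self.vertices[vertex_id] = {"type": vertex_type, **attrs}
        return 1


async def upsert_node(conn, workspace, node_id, attrs, attempts=3, base_delay=0.01):
    """Run the blocking write in a worker thread, retrying transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            # to_thread keeps the event loop free while the driver call blocks.
            return await asyncio.to_thread(
                conn.upsert_vertex, workspace, node_id, attrs
            )
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff between attempts.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

Only writes are retried here, mirroring the "retry logic for write operations" bullet; which exceptions count as transient for a real TigerGraph connection is left open.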

2. Storage Registry Updates (lightrag/kg/__init__.py)

  • Added TigerGraphStorage to GRAPH_STORAGE implementations list
  • Added environment variable requirements: TIGERGRAPH_URI, TIGERGRAPH_USERNAME, TIGERGRAPH_PASSWORD
  • Added module mapping: "TigerGraphStorage": ".kg.tigergraph_impl"
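
To make the registry changes concrete, here is a rough sketch of how a name-to-module mapping and a list of required environment variables can work together. The names `ENV_REQUIREMENTS`, `MODULE_MAP`, `missing_env_vars`, and `resolve_module` are illustrative assumptions, not LightRAG's actual identifiers; only the `TigerGraphStorage` values come from this PR.

```python
import os

# Illustrative registry entries. Only the TigerGraphStorage values are taken
# from the PR description; the dictionary names are assumptions.
ENV_REQUIREMENTS = {
    "TigerGraphStorage": [
        "TIGERGRAPH_URI",
        "TIGERGRAPH_USERNAME",
        "TIGERGRAPH_PASSWORD",
    ],
}
MODULE_MAP = {"TigerGraphStorage": ".kg.tigergraph_impl"}


def missing_env_vars(storage_name, environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    required = ENV_REQUIREMENTS.get(storage_name, [])
    return [name for name in required if not environ.get(name)]


def resolve_module(storage_name, package="lightrag"):
    """Turn the relative module path into an absolute dotted path."""
    return package + MODULE_MAP[storage_name]
```

With a mapping like this, the backend module only needs to be imported when the storage is actually selected, for example through `importlib.import_module(".kg.tigergraph_impl", package="lightrag")`.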

3. Configuration Examples

  • env.example: Added TigerGraph configuration section with:

    • TIGERGRAPH_URI (default: http://localhost:9000)
    • TIGERGRAPH_USERNAME (default: tigergraph)
    • TIGERGRAPH_PASSWORD (required)
    • TIGERGRAPH_GRAPH_NAME (default: lightrag)
    • TIGERGRAPH_WORKSPACE (optional, for workspace override)
  • config.ini.example: Added [tigergraph] section with corresponding configuration options
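
As a minimal sketch of how the variables above could be read with their documented defaults: `TigerGraphSettings` and `load_settings` are hypothetical helpers written for this example, and the PR may resolve configuration differently (for instance by falling back to `config.ini`).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TigerGraphSettings:
    uri: str
    username: str
    password: str
    graph_name: str
    workspace: Optional[str]


def load_settings(environ: Optional[Mapping[str, str]] = None) -> TigerGraphSettings:
    """Read the TIGERGRAPH_* variables, applying the defaults from env.example."""
    env = os.environ if environ is None else environ
    password = env.get("TIGERGRAPH_PASSWORD")
    if not password:
        # The password has no default, so fail early with a clear message.
        raise ValueError("TIGERGRAPH_PASSWORD is required")
    return TigerGraphSettings(
        uri=env.get("TIGERGRAPH_URI", "http://localhost:9000"),
        username=env.get("TIGERGRAPH_USERNAME", "tigergraph"),
        password=password,
        graph_name=env.get("TIGERGRAPH_GRAPH_NAME", "lightrag"),
        workspace=env.get("TIGERGRAPH_WORKSPACE") or None,  # optional override
    )
```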

Implementation Details

  • Schema Design: Follows Neo4j pattern with workspace-based vertex types
  • Primary Key: Uses entity_id as vertex primary key (STRING type)
  • Properties: Supports entity_type, description, keywords, source_id with dynamic attribute support
  • Edge Type: A single undirected edge type, named "DIRECTED", is used for all relationships
  • Async Compatibility: All synchronous pyTigerGraph calls wrapped in asyncio.to_thread() for async compatibility
  • Connection Management: Uses get_data_init_lock() and get_graph_db_lock() for thread-safe initialization
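
The connection-management point can be illustrated with a small sketch of once-only initialization under a lock. It assumes a plain `asyncio.Lock` as a stand-in for LightRAG's `get_data_init_lock()` and `get_graph_db_lock()`, whose exact semantics are not shown in this PR description; `LazyGraphClient` is a hypothetical class.

```python
import asyncio


class LazyGraphClient:
    """Hypothetical holder that creates its connection once, under a lock."""

    def __init__(self, connect):
        self._connect = connect  # blocking factory, e.g. a driver constructor
        self._conn = None
        self._init_lock = asyncio.Lock()  # stand-in for the shared init lock

    async def initialize(self):
        # Concurrent callers queue on the lock; only the first one connects.
        async with self._init_lock:
            if self._conn is None:
                self._conn = await asyncio.to_thread(self._connect)
        return self._conn
```

Creating the client inside a running event loop and then awaiting `initialize()` from several tasks should result in a single call to the factory, with every task receiving the same connection object.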

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

Testing Requirements

  • Requires a running TigerGraph instance for testing
  • Default connection: http://localhost:9000
  • The graph must be created beforehand; otherwise the default graph name from the configuration is used
  • Schema is automatically created on first initialization

Dependencies

  • pyTigerGraph: install via pip install -e ".[offline-storage]", or let pipmaster install it automatically
  • No changes to existing dependencies or requirements files
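
For readers who prefer an explicit check over automatic installation, a standard-library sketch of an optional-dependency guard is shown below. It does not reproduce what pipmaster does; `require_module` is a hypothetical helper that only reports a missing package.

```python
import importlib.util


def require_module(module_name: str, install_hint: str) -> None:
    """Raise a readable error if a top-level optional dependency is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(f"{module_name} is not installed. Try: {install_hint}")
```

A call such as `require_module("pyTigerGraph", 'pip install -e ".[offline-storage]"')` placed before the backend is constructed would surface the problem with an actionable message.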

Compatibility

  • Follows the same patterns as Neo4j implementation for consistency
  • Compatible with existing LightRAG workflows
  • Maintains workspace isolation similar to other backends
  • Supports all query modes (local, global, hybrid, naive, mix)

Potential Future Enhancements

  • Batch query optimization for better performance with large datasets
  • TigerVector integration for native vector search capabilities
  • Custom GSQL queries for complex graph traversals
  • Connection pooling optimization for high-throughput scenarios

Known Limitations

  • Some operations (like get_nodes_batch) iterate through nodes sequentially due to TigerGraph API limitations
  • Schema creation uses GSQL commands which require appropriate permissions
  • Large graph queries may need optimization for production use
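
One possible mitigation for the sequential iteration noted above is to keep one request per node but run several of them at once. The sketch below assumes a per-node coroutine `get_node` and uses a semaphore to bound concurrency; the function body and the `max_concurrency` parameter are illustrative, not the PR's implementation, and whether a pyTigerGraph connection tolerates concurrent calls has not been verified here.

```python
import asyncio


async def get_nodes_batch(get_node, node_ids, max_concurrency=8):
    """Fetch nodes one request each, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch(node_id):
        async with semaphore:
            return node_id, await get_node(node_id)

    pairs = await asyncio.gather(*(fetch(node_id) for node_id in node_ids))
    # Skip IDs whose lookup returned None (node not found).
    return {node_id: node for node_id, node in pairs if node is not None}
```

A server-side batch query would likely scale better for large inputs; bounded concurrency is simply the smallest change from a sequential loop.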

Comment thread lightrag/kg/tigergraph_impl.py Outdated
def _search_labels():
    try:
        # Get all vertices and filter
        vertices = self._conn.getVertices(workspace_label, limit=100000)
Collaborator

Querying via traversal is inefficient and not recommended.

Author

done

Comment thread lightrag/kg/tigergraph_impl.py Outdated
def _get_popular_labels():
    try:
        # Get all vertices and calculate degrees
        vertices = self._conn.getVertices(workspace_label, limit=100000)
Collaborator

Querying via traversal is inefficient and not recommended.

Author

done

Comment thread lightrag/kg/tigergraph_impl.py Outdated
            edges_dict[node_id] = edges if edges is not None else []
        return edges_dict

    async def get_nodes_by_chunk_ids(self, chunk_ids: list[str]) -> list[dict]:
Collaborator

get_nodes_by_chunk_ids is deprecated

Author

done

Comment thread lightrag/kg/tigergraph_impl.py Outdated

        return await asyncio.to_thread(_get_nodes_by_chunk_ids)

    async def get_edges_by_chunk_ids(self, chunk_ids: list[str]) -> list[dict]:
Collaborator

get_edges_by_chunk_ids is deprecated

Author

done

@danielaskdd
Collaborator

Thank you for your contribution and your interest in LightRAG. At this stage, we are not merging new storage implementations into the main branch. As LightRAG is undergoing rapid iteration, maintaining multiple backends introduces significant overhead in compatibility testing, performance tuning, and data migration, which currently exceeds our team's operational capacity.

danielaskdd added the enhancement (New feature or request) label Mar 18, 2026