Tutorial 2: Knowledge Graph Reasoning with VSAX¶

This tutorial demonstrates how to use Vector Symbolic Architectures (VSAs) for knowledge graph representation and reasoning.

What You'll Learn¶

Encode knowledge as relational triples (subject-relation-object)
Build and query a knowledge base using VSA
Use resonator networks to factorize compositional structures
Perform multi-hop reasoning to infer new knowledge
Compare different VSA models for knowledge representation

Why VSA for Knowledge Graphs?¶

VSAs offer several advantages for knowledge representation:

Compositional: Facts can be composed using binding operations
Distributed: Knowledge is spread across high-dimensional vectors
Robust: Tolerant to noise and partial information
Efficient: Binding and unbinding cost O(d) in the vector dimension d, independent of how many facts a bundle holds
Analogical: Similar facts have similar representations

Setup¶

import jax.numpy as jnp
from vsax import create_fhrr_model, create_map_model, create_binary_model
from vsax import VSAMemory
from vsax.encoders import GraphEncoder
from vsax.resonator import CleanupMemory, Resonator
from vsax.similarity import cosine_similarity
from vsax.utils import format_similarity_results

# Create FHRR model (best for exact unbinding)
model = create_fhrr_model(dim=512)
memory = VSAMemory(model)

print(f"Model: {model.rep_cls.__name__}")
print(f"Dimension: {model.dim}")

Output:

Model: ComplexHypervector
Dimension: 512

Building the Knowledge Base¶

We'll create a simple animal taxonomy with:

- Taxonomy relations: X isA Y (dog isA mammal)
- Property relations: X hasProperty Y (dog hasProperty fur)
- Action relations: X can Y (dog can bark)

# Define all concepts we'll need
concepts = [
    # Animals
    "dog", "cat", "bird", "fish", "snake",
    # Categories
    "mammal", "reptile", "animal",
    # Relations
    "isA", "hasProperty", "can",
    # Properties
    "fur", "feathers", "scales", "warm_blooded", "cold_blooded",
    # Actions
    "bark", "meow", "fly", "swim", "slither"
]

# Add all concepts to memory
memory.add_many(concepts)
print(f"Knowledge base contains {len(memory)} concepts")

Output:

Knowledge base contains 21 concepts

# Define knowledge as triples: (subject, relation, object)
facts = [
    # Taxonomy
    ("dog", "isA", "mammal"),
    ("cat", "isA", "mammal"),
    ("bird", "isA", "animal"),
    ("fish", "isA", "animal"),
    ("snake", "isA", "reptile"),
    ("mammal", "isA", "animal"),
    ("reptile", "isA", "animal"),

    # Properties
    ("dog", "hasProperty", "fur"),
    ("cat", "hasProperty", "fur"),
    ("bird", "hasProperty", "feathers"),
    ("fish", "hasProperty", "scales"),
    ("snake", "hasProperty", "scales"),
    ("mammal", "hasProperty", "warm_blooded"),
    ("reptile", "hasProperty", "cold_blooded"),

    # Actions
    ("dog", "can", "bark"),
    ("cat", "can", "meow"),
    ("bird", "can", "fly"),
    ("fish", "can", "swim"),
    ("snake", "can", "slither"),
]

print(f"Knowledge base contains {len(facts)} facts")
print("\nSample facts:")
for fact in facts[:5]:
    print(f"  {fact[0]} {fact[1]} {fact[2]}")

Output:

Knowledge base contains 19 facts

Sample facts:
  dog isA mammal
  cat isA mammal
  bird isA animal
  fish isA animal
  snake isA reptile

Encoding Facts as Hypervectors¶

Each fact (subject, relation, object) is encoded as:

fact = bind(subject, bind(relation, object))

This allows us to:

- Query for objects given subject and relation
- Query for relations given subject and object
- Factorize facts using resonator networks

# Store individual facts
fact_hvs = {}

for subject, relation, obj in facts:
    s_hv = memory[subject]
    r_hv = memory[relation]
    o_hv = memory[obj]

    # Encode: bind(subject, bind(relation, object))
    ro = model.opset.bind(r_hv.vec, o_hv.vec)
    fact_hv = model.opset.bind(s_hv.vec, ro)

    fact_hvs[(subject, relation, obj)] = model.rep_cls(fact_hv)

print(f"Encoded {len(fact_hvs)} facts as hypervectors")

Output:

Encoded 19 facts as hypervectors

Querying the Knowledge Base¶

We can query facts by unbinding, using the opset's explicit unbind method:

Query: "What is a dog?" (dog isA ?)

query = unbind(fact, bind(dog, isA))

def query_fact(subject: str, relation: str) -> str:
    """Query: subject + relation -> object"""
    # Find the matching fact
    for (s, r, o), fact_hv in fact_hvs.items():
        if s == subject and r == relation:
            # Unbind to get the object
            s_hv = memory[subject]
            r_hv = memory[relation]

            # query = unbind(fact, bind(subject, relation))
            sr = model.opset.bind(s_hv.vec, r_hv.vec)
            query_result = model.opset.unbind(fact_hv.vec, sr)

            # Find most similar concept
            similarities = {}
            for concept in concepts:
                sim = cosine_similarity(query_result, memory[concept].vec)
                similarities[concept] = sim

            best_match = max(similarities, key=similarities.get)
            confidence = similarities[best_match]

            return f"{best_match} (confidence: {confidence:.3f})"

    return "No fact found"

# Test queries
print("Querying the knowledge base:")
print(f"dog isA? -> {query_fact('dog', 'isA')}")
print(f"cat isA? -> {query_fact('cat', 'isA')}")
print(f"dog hasProperty? -> {query_fact('dog', 'hasProperty')}")
print(f"dog can? -> {query_fact('dog', 'can')}")
print(f"bird can? -> {query_fact('bird', 'can')}")

Output:

Querying the knowledge base:
dog isA? -> mammal (confidence: 1.000)
cat isA? -> mammal (confidence: 1.000)
dog hasProperty? -> fur (confidence: 1.000)
dog can? -> bark (confidence: 1.000)
bird can? -> fly (confidence: 1.000)
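The queries above recover the object of a fact. The relation query listed earlier (given subject and object, find the relation) works the same way. The following is a minimal sketch using only the Python standard library, not the VSAX API: it assumes FHRR-style vectors of unit-magnitude complex numbers, binding as element-wise multiplication, and unbinding as multiplication by the conjugate. The helper names (`random_phasor`, `bind`, `unbind`, `similarity`, `nearest`) are illustrative and are not taken from VSAX.

```python
import cmath
import math
import random

def random_phasor(dim, rng):
    """Random vector of unit-magnitude complex numbers (FHRR-style)."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    """Element-wise product: phases add."""
    return [x * y for x, y in zip(a, b)]

def unbind(a, b):
    """Element-wise product with the conjugate of b: phases subtract."""
    return [x * y.conjugate() for x, y in zip(a, b)]

def similarity(a, b):
    """Real part of the normalised inner product of two complex vectors."""
    dot = sum(x * y.conjugate() for x, y in zip(a, b))
    norm_a = math.sqrt(sum(abs(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(abs(y) ** 2 for y in b))
    return dot.real / (norm_a * norm_b)

rng = random.Random(0)
names = ["dog", "isA", "hasProperty", "can", "mammal", "fur", "bark"]
atoms = {name: random_phasor(512, rng) for name in names}

# fact = bind(subject, bind(relation, object))
fact = bind(atoms["dog"], bind(atoms["isA"], atoms["mammal"]))

# Object query: strip subject and relation, leaving the object vector.
obj_estimate = unbind(fact, bind(atoms["dog"], atoms["isA"]))
# Relation query: strip subject and object, leaving the relation vector.
rel_estimate = unbind(fact, bind(atoms["dog"], atoms["mammal"]))

def nearest(vec, candidates):
    """Name of the candidate whose atom is most similar to vec."""
    return max(candidates, key=lambda name: similarity(vec, atoms[name]))

best_object = nearest(obj_estimate, ["mammal", "fur", "bark"])
best_relation = nearest(rel_estimate, ["isA", "hasProperty", "can"])
print(best_object, best_relation)  # mammal isA
```

Because every entry has magnitude 1, multiplying by the conjugate cancels a bound factor exactly, which is why a single unbound fact returns its missing component with similarity close to 1.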

Factorization with Resonator Networks¶

Given a composite fact, we can use resonators to decode its components:

- Input: A fact hypervector
- Output: The (subject, relation, object) triple

# Create cleanup memories for each category
animals = ["dog", "cat", "bird", "fish", "snake"]
relations = ["isA", "hasProperty", "can"]
all_objects = ["mammal", "reptile", "animal", "fur", "feathers", "scales",
               "warm_blooded", "cold_blooded", "bark", "meow", "fly", "swim", "slither"]

subject_cleanup = CleanupMemory(model, memory, animals)
relation_cleanup = CleanupMemory(model, memory, relations)
object_cleanup = CleanupMemory(model, memory, all_objects)

# Create resonator
resonator = Resonator(
    model=model,
    codebooks=[subject_cleanup, relation_cleanup, object_cleanup],
    max_iterations=20,
    convergence_threshold=0.95
)

print(f"Created resonator with {len(resonator.codebooks)} codebooks")

Output:

Created resonator with 3 codebooks

# Test factorization
test_facts = [
    ("dog", "isA", "mammal"),
    ("bird", "can", "fly"),
    ("snake", "hasProperty", "scales"),
]

print("Factorizing facts with resonator:\n")
for subject, relation, obj in test_facts:
    fact_hv = fact_hvs[(subject, relation, obj)]

    # Factorize
    factors = resonator.factorize(fact_hv.vec, return_history=False)

    print(f"Original: ({subject}, {relation}, {obj})")
    print(f"Decoded:  ({factors[0]}, {factors[1]}, {factors[2]})")
    print()

Output:

Factorizing facts with resonator:

Original: (dog, isA, mammal)
Decoded:  (dog, isA, mammal)

Original: (bird, can, fly)
Decoded:  (bird, can, fly)

Original: (snake, hasProperty, scales)
Decoded:  (snake, hasProperty, scales)
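To make the iteration inside a resonator more concrete, here is a minimal sketch in plain Python. It is not the VSAX `Resonator` implementation: it assumes FHRR-style phasor vectors, starts each factor estimate as the superposition of its codebook, and runs a fixed number of update sweeps instead of checking a convergence threshold. All function names are hypothetical.

```python
import cmath
import math
import random

def random_phasor(dim, rng):
    """Random vector of unit-magnitude complex numbers (FHRR-style)."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(a, b):
    return [x * y.conjugate() for x, y in zip(a, b)]

def project(vec):
    """Keep only the phase of each entry (map back onto the unit circle)."""
    return [x / abs(x) if abs(x) > 1e-12 else 1 + 0j for x in vec]

def inner(vec, entry):
    """Complex inner product <entry, vec>."""
    return sum(v * e.conjugate() for v, e in zip(vec, entry))

def cleanup(vec, codebook):
    """Re-express vec as a similarity-weighted sum of codebook entries."""
    out = [0j] * len(vec)
    for entry in codebook.values():
        weight = inner(vec, entry)
        out = [o + weight * e for o, e in zip(out, entry)]
    return project(out)

def superpose(codebook):
    """Initial guess: all codebook entries added together."""
    entries = list(codebook.values())
    return project([sum(column) for column in zip(*entries)])

def factorize(composite, codebooks, sweeps=10):
    """Estimate one factor at a time by unbinding the other estimates."""
    estimates = [superpose(cb) for cb in codebooks]
    for _ in range(sweeps):
        for i, codebook in enumerate(codebooks):
            remainder = composite
            for j, estimate in enumerate(estimates):
                if j != i:
                    remainder = unbind(remainder, estimate)
            estimates[i] = cleanup(remainder, codebook)
    # Report the closest codebook name for each factor.
    return [max(cb, key=lambda name: inner(est, cb[name]).real)
            for est, cb in zip(estimates, codebooks)]

rng = random.Random(0)
dim = 512
subjects = {n: random_phasor(dim, rng) for n in ["dog", "cat", "bird", "fish", "snake"]}
relations = {n: random_phasor(dim, rng) for n in ["isA", "hasProperty", "can"]}
objects = {n: random_phasor(dim, rng) for n in [
    "mammal", "reptile", "animal", "fur", "feathers", "scales", "warm_blooded",
    "cold_blooded", "bark", "meow", "fly", "swim", "slither"]}

composite = bind(subjects["dog"], bind(relations["isA"], objects["mammal"]))
decoded = factorize(composite, [subjects, relations, objects])
print(decoded)
```

Each sweep sharpens one estimate using the current guesses for the other two, so the search never enumerates all 5 x 3 x 13 combinations explicitly.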

Multi-hop Reasoning¶

VSAs enable multi-hop reasoning through composition:

Example: If "dog isA mammal" and "mammal isA animal", then "dog isA animal"

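We can compose facts by:

1. Unbinding to get intermediate results
2. Binding with new relations
3. Querying the composed structure

For readability, the helper below follows the chain symbolically, matching on the stored triple keys rather than on unbound vectors.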

def multi_hop_query(start: str, relation1: str, relation2: str) -> str:
    """Two-hop query: start -relation1-> X -relation2-> ?"""

    # First hop: start -relation1-> intermediate
    intermediate = None
    for (s, r, o), fact_hv in fact_hvs.items():
        if s == start and r == relation1:
            intermediate = o
            break

    if intermediate is None:
        return "No path found"

    # Second hop: intermediate -relation2-> result
    result = None
    for (s, r, o), fact_hv in fact_hvs.items():
        if s == intermediate and r == relation2:
            result = o
            break

    if result is None:
        return f"Reached {intermediate}, but no further"

    return f"{start} -{relation1}-> {intermediate} -{relation2}-> {result}"

print("Multi-hop reasoning:\n")
print(multi_hop_query("dog", "isA", "isA"))  # dog -> mammal -> animal
print(multi_hop_query("cat", "isA", "isA"))  # cat -> mammal -> animal
print(multi_hop_query("snake", "isA", "isA"))  # snake -> reptile -> animal

Output:

Multi-hop reasoning:

dog -isA-> mammal -isA-> animal
cat -isA-> mammal -isA-> animal
snake -isA-> reptile -isA-> animal
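The helper above walks the chain by matching triple keys. The same two hops can also be taken in vector space. The sketch below is one way to do that under stated assumptions: plain Python rather than the VSAX API, FHRR-style phasor vectors, a bundle formed by simple addition, and hypothetical helper names (`hop`, `two_hop`). Note one consequence of the encoding: element-wise binding is commutative, so bind(subject, bind(relation, object)) does not distinguish subject from object. Unbinding (mammal, isA) from the bundle therefore also yields dog and cat, which is why the cleanup step here only considers category names.

```python
import cmath
import math
import random

def random_phasor(dim, rng):
    """Random vector of unit-magnitude complex numbers (FHRR-style)."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(a, b):
    return [x * y.conjugate() for x, y in zip(a, b)]

def similarity(a, b):
    """Real part of the normalised inner product of two complex vectors."""
    dot = sum(x * y.conjugate() for x, y in zip(a, b))
    norm_a = math.sqrt(sum(abs(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(abs(y) ** 2 for y in b))
    return dot.real / (norm_a * norm_b)

rng = random.Random(1)
dim = 512
names = ["dog", "cat", "snake", "mammal", "reptile", "animal", "isA"]
atoms = {name: random_phasor(dim, rng) for name in names}

taxonomy = [
    ("dog", "isA", "mammal"),
    ("cat", "isA", "mammal"),
    ("snake", "isA", "reptile"),
    ("mammal", "isA", "animal"),
    ("reptile", "isA", "animal"),
]

# Bundle the taxonomy facts by element-wise addition.
graph = [0j] * dim
for s, r, o in taxonomy:
    fact = bind(atoms[s], bind(atoms[r], atoms[o]))
    graph = [g + f for g, f in zip(graph, fact)]

categories = ["mammal", "reptile", "animal"]

def hop(subject, relation):
    """Unbind (subject, relation) from the bundle and clean up to a category."""
    noisy = unbind(graph, bind(atoms[subject], atoms[relation]))
    return max(categories, key=lambda name: similarity(noisy, atoms[name]))

def two_hop(start, relation1, relation2):
    """Feed the cleaned-up result of the first hop into the second."""
    middle = hop(start, relation1)
    return middle, hop(middle, relation2)

print(two_hop("dog", "isA", "isA"))
print(two_hop("snake", "isA", "isA"))
```

The cleanup between hops matters: the unbound vector is noisy, and snapping it to the nearest stored concept keeps that noise from compounding in the second hop.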

Property Inheritance¶

We can infer inherited properties through the taxonomy:

def get_all_properties(animal: str) -> list[str]:
    """Get direct and inherited properties of an animal."""
    properties = []

    # Direct properties
    for (s, r, o), _ in fact_hvs.items():
        if s == animal and r == "hasProperty":
            properties.append(f"{o} (direct)")

    # Find category
    category = None
    for (s, r, o), _ in fact_hvs.items():
        if s == animal and r == "isA":
            category = o
            break

    # Inherited properties from category
    if category:
        for (s, r, o), _ in fact_hvs.items():
            if s == category and r == "hasProperty":
                properties.append(f"{o} (inherited from {category})")

    return properties

print("Property inheritance:\n")
for animal in ["dog", "cat", "snake"]:
    props = get_all_properties(animal)
    print(f"{animal}:")
    for prop in props:
        print(f"  - {prop}")
    print()

Output:

Property inheritance:

dog:
  - fur (direct)
  - warm_blooded (inherited from mammal)

cat:
  - fur (direct)
  - warm_blooded (inherited from mammal)

snake:
  - scales (direct)
  - cold_blooded (inherited from reptile)

Building a Complete Knowledge Graph¶

Let's bundle all facts into a single knowledge graph hypervector:

# Bundle all facts
all_fact_vecs = [fact_hv.vec for fact_hv in fact_hvs.values()]
knowledge_graph = model.opset.bundle(*all_fact_vecs)
knowledge_graph_hv = model.rep_cls(knowledge_graph)

print(f"Created knowledge graph with {len(facts)} facts")
print(f"Shape: {knowledge_graph_hv.shape}")
print(f"Type: {type(knowledge_graph_hv).__name__}")

Output:

Created knowledge graph with 19 facts
Shape: (512,)
Type: ComplexHypervector

# Query the bundled knowledge graph
def query_kg(subject: str, relation: str) -> list[tuple[str, float]]:
    """Query the bundled knowledge graph for similar objects."""
    s_hv = memory[subject]
    r_hv = memory[relation]

    # Unbind subject and relation from the knowledge graph
    sr = model.opset.bind(s_hv.vec, r_hv.vec)
    query_result = model.opset.unbind(knowledge_graph, sr)

    # Find similar concepts
    results = []
    for concept in all_objects:
        sim = cosine_similarity(query_result, memory[concept].vec)
        results.append((concept, float(sim)))

    # Sort by similarity
    results.sort(key=lambda x: x[1], reverse=True)
    return results[:5]

print("Querying bundled knowledge graph:\n")
print("dog isA ...")
for obj, sim in query_kg("dog", "isA"):
    print(f"  {obj}: {sim:.3f}")

print("\nbird hasProperty ...")
for obj, sim in query_kg("bird", "hasProperty"):
    print(f"  {obj}: {sim:.3f}")

Output:

Querying bundled knowledge graph:

dog isA ...
  mammal: 0.682
  warm_blooded: 0.241
  fur: 0.195
  animal: 0.169
  bark: 0.141

bird hasProperty ...
  feathers: 0.618
  fly: 0.223
  animal: 0.176
  scales: 0.145
  mammal: 0.134
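The correct answers above score well below 1.0 because every other fact in the bundle contributes noise to the unbound result. The sketch below illustrates that trend with random facts in plain Python. It assumes FHRR-style phasor vectors and a bundle formed by unnormalised addition, for which the similarity of the correct object falls roughly as 1/sqrt(n) with n bundled facts. VSAX may normalise bundles differently, so these numbers are not expected to match the output above; the function name `query_similarity` is hypothetical.

```python
import cmath
import math
import random

def random_phasor(dim, rng):
    """Random vector of unit-magnitude complex numbers (FHRR-style)."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]

def bind(a, b):
    return [x * y for x, y in zip(a, b)]

def unbind(a, b):
    return [x * y.conjugate() for x, y in zip(a, b)]

def similarity(a, b):
    """Real part of the normalised inner product of two complex vectors."""
    dot = sum(x * y.conjugate() for x, y in zip(a, b))
    norm_a = math.sqrt(sum(abs(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(abs(y) ** 2 for y in b))
    return dot.real / (norm_a * norm_b)

def query_similarity(n_facts, dim, rng):
    """Bundle n_facts random facts, query the first, score its true object."""
    graph = [0j] * dim
    first = None
    for _ in range(n_facts):
        s, r, o = (random_phasor(dim, rng) for _ in range(3))
        if first is None:
            first = (s, r, o)
        fact = bind(s, bind(r, o))
        graph = [g + f for g, f in zip(graph, fact)]
    s, r, o = first
    return similarity(unbind(graph, bind(s, r)), o)

rng = random.Random(0)
sims = {n: query_similarity(n, 1024, rng) for n in (1, 4, 16)}
for n, sim in sims.items():
    print(f"{n:3d} facts: similarity {sim:.3f}, 1/sqrt(n) = {1 / math.sqrt(n):.3f}")
```

Raising the dimension does not change the expected similarity of the correct answer, but it shrinks the spread of the incorrect candidates around zero, which is what makes the ranking reliable for larger bundles.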

Comparing VSA Models¶

Let's compare FHRR, MAP, and Binary models for knowledge graph tasks:

def test_model(model_name: str, model, dim: int = 512):
    """Test a VSA model on knowledge graph encoding/decoding."""
    memory = VSAMemory(model)
    memory.add_many(concepts)

    # Encode a test fact
    subject, relation, obj = "dog", "isA", "mammal"
    s_hv = memory[subject]
    r_hv = memory[relation]
    o_hv = memory[obj]

    ro = model.opset.bind(r_hv.vec, o_hv.vec)
    fact_hv = model.opset.bind(s_hv.vec, ro)

    # Unbind and query
    sr = model.opset.bind(s_hv.vec, r_hv.vec)
    query_result = model.opset.unbind(fact_hv, sr)

    # Find similarity to correct answer
    similarity = cosine_similarity(query_result, o_hv.vec)

    return float(similarity)

models_to_test = [
    ("FHRR", create_fhrr_model(dim=512)),
    ("MAP", create_map_model(dim=512)),
    ("Binary", create_binary_model(dim=10000)),  # Binary needs higher dim
]

print("Model comparison (unbinding accuracy):\n")
# Use a distinct loop variable so the global FHRR `model` is not overwritten
for name, candidate in models_to_test:
    accuracy = test_model(name, candidate)
    print(f"{name:10s}: {accuracy:.4f}")

Output:

Model comparison (unbinding accuracy):

FHRR      : 1.0000
MAP       : 0.9876
Binary    : 0.9823

Key Takeaways¶

Compositional Encoding: Facts are encoded as bind(subject, bind(relation, object))
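Efficient Querying: Unbinding costs O(d) in the vector dimension, independent of the number of bundled facts; cleanup still compares the result against each candidate concept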
Factorization: Resonators can decode compositional structures
Multi-hop Reasoning: Chaining facts enables inference
Property Inheritance: Taxonomic relationships support reasoning
Model Choice: FHRR provides exact unbinding, best for knowledge graphs

Next Steps¶

Try larger knowledge bases
Implement more complex reasoning patterns
Experiment with analogical reasoning
Combine with neural networks for hybrid approaches
Explore temporal reasoning (adding time as a dimension)

Running This Tutorial¶

This tutorial is available as a Jupyter notebook at examples/notebooks/tutorial_02_knowledge_graph.ipynb.

To run it:

jupyter notebook examples/notebooks/tutorial_02_knowledge_graph.ipynb