Skip to content

Lesson 5.3: Research Frontiers & Open Problems

Estimated time: 45 minutes

Learning Objectives

By the end of this lesson, you will be able to:

  • Identify current research directions in Vector Symbolic Architectures
  • Understand open problems and challenges in the field
  • Explore connections between VSA and modern deep learning
  • Recognize opportunities for neuromorphic hardware implementations
  • Find ways to contribute to VSAX and the VSA research community
  • Formulate your own research questions in VSA

Prerequisites

  • Completion of Modules 1-4 (foundational VSA knowledge)
  • Lessons 5.1-5.2 (VFA and custom encoders)
  • Interest in research and pushing boundaries

The State of VSA Research (2025)

Vector Symbolic Architectures have evolved from a niche area to a vibrant research field with applications across:

  • Cognitive Science: Modeling human memory and reasoning
  • Robotics: Efficient control and perception
  • Neuroscience: Understanding neural computation
  • Edge Computing: Lightweight AI for resource-constrained devices
  • Neuromorphic Hardware: Brain-inspired computing architectures

Recent momentum: - πŸ“ˆ Papers at major conferences (NeurIPS, ICML, ICLR, CVPR) - 🏒 Industrial adoption (Intel, IBM, startups) - πŸ”¬ Cross-disciplinary collaborations (neuroscience + ML + hardware)


Research Direction 1: Learning-Based VSA

The Problem

Traditional VSA uses random basis vectors:

memory.add("cat")  # Samples random vector

Limitations: - No semantic structure (random vectors don't capture meaning) - Cannot leverage pre-trained knowledge - Similarity depends on encoding, not inherent meaning

Research Question

Can we learn basis vectors that capture semantic structure?

Approaches

1. Pre-training from Data

Train basis vectors to preserve semantic relationships:

# Hypothetical learned basis
memory.add_learned("cat", embedding_from_bert("cat"))
memory.add_learned("dog", embedding_from_bert("dog"))

# Now cat and dog have inherent similarity!
sim = cosine_similarity(memory["cat"].vec, memory["dog"].vec)  # High!

Current work: - Mikolić et al. (2024): "Learning Hyperdimensional Representations from Data" - VSA + Word2Vec/BERT hybrid models

2. Meta-Learning VSA Operations

Learn optimal binding/bundling strategies for specific domains:

# Learn domain-specific binding operator
learned_bind = LearnedBindingOperator(domain="vision")
scene = learned_bind(object1, object2)  # Better than random binding?

Open problems: - How to integrate gradient-based learning with discrete VSA ops? - Can we learn in high dimensions (10,000+) efficiently? - What inductive biases preserve VSA properties?

Your Research Opportunity

  • Project idea: Train VSA basis vectors using contrastive learning
  • Questions: Do learned bases retain VSA's symbolic properties? How much data is needed?

Research Direction 2: VSA + Large Language Models

The Problem

LLMs are powerful but: - ❌ Opaque reasoning (black box) - ❌ Cannot perform symbolic operations (binding, unbinding) - ❌ Huge computational cost

VSA is efficient but: - ❌ Requires manual feature engineering - ❌ Cannot handle raw text well

Research Question

How can we combine LLM representations with VSA symbolic reasoning?

Approaches

1. LLM Embeddings β†’ VSA Encoding

from transformers import AutoModel, AutoTokenizer

# Get LLM embedding
model_llm = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "The cat sat on the mat"
tokens = tokenizer(text, return_tensors="pt")
embedding = model_llm(**tokens).last_hidden_state.mean(dim=1)  # [CLS] pooling

# Project to VSA space
vsa_hv = project_to_vsa(embedding, dim=2048)

# Now can perform symbolic operations
memory.add("sentence")
memory["sentence"] = model.rep_cls(vsa_hv)

2. VSA-Augmented LLM Reasoning

Use VSA as external symbolic memory for LLMs:

LLM generates text β†’ Extract facts β†’ Encode in VSA β†’ Query VSA β†’ Feed back to LLM

Current work: - Nunes et al. (2024): "Neuro-Symbolic Question Answering with HDC" - VSA for factual knowledge grounding in LLMs

3. Hybrid Architecture

Input β†’ LLM encoder β†’ VSA layer β†’ Reasoning β†’ VSA decoder β†’ LLM decoder β†’ Output
         (perception)   (symbolic)  (inference)  (symbolic)    (generation)

Open problems: - How to backpropagate through VSA operations? - Can VSA help with LLM hallucinations (via symbolic grounding)? - What's the optimal dimensionality for LLM-VSA hybrids?

Your Research Opportunity

  • Project idea: Build a VSA-augmented chatbot that can reason symbolically
  • Questions: Does VSA improve factual accuracy? Can it explain reasoning steps?

Research Direction 3: Neuromorphic Hardware

The Problem

Traditional von Neumann architectures are inefficient for VSA: - High-dimensional vectors (10,000 dims) β†’ memory bottleneck - Binding/bundling are embarrassingly parallel β†’ underutilized

Research Question

Can we build specialized hardware for VSA that's orders of magnitude more efficient?

Approaches

1. In-Memory Computing

Store hypervectors directly in analog memory:

Hypervector [0.3, -0.7, 0.5, ...] β†’ Stored as resistances in ReRAM
Bundling β†’ Parallel current summation (analog addition)
Binding β†’ Element-wise multiply in crossbar array

Advantages: - ⚑ Extremely fast (nanosecond operations) - πŸ”‹ Low power (no data movement) - πŸ“¦ Compact (analog storage)

Current work: - Karunaratne et al. (2020): "In-memory hyperdimensional computing" - IBM neuromorphic chips with HDC

2. Spiking Neural Networks (SNNs) + VSA

Encode hypervectors as spike patterns:

Hypervector component 0.7 β†’ High firing rate neuron
Hypervector component -0.3 β†’ Low firing rate neuron

Advantages: - 🧠 Biologically plausible - ⚑ Event-driven (sparse activity) - πŸ”‹ Ultra-low power

Current work: - Mitrokhin et al. (2023): "Learning sensorimotor representations with spiking HDC" - Intel Loihi neuromorphic chip

3. Photonic Computing

Use light for hypervector operations:

Hypervector β†’ Optical signal (wavelength/phase encoded)
Binding β†’ Optical interference
Bundling β†’ Beam combining

Advantages: - πŸš€ Speed of light computation - ❄️ Minimal heat generation - πŸ”§ Massively parallel

Open problems: - How to handle negative values in analog hardware? - Can we train VSA models directly on neuromorphic chips? - What's the optimal hypervector representation for SNNs?

Your Research Opportunity

  • Project idea: Simulate VSA on spiking neural network simulator (e.g., Brian2, NEST)
  • Questions: What's the energy efficiency gain? Can SNNs learn VSA operations?

Research Direction 4: Dimensionality & Capacity

The Problem

Current VSA uses fixed dimensionality (typically 512-10,000): - Too low β†’ poor capacity, high noise - Too high β†’ wasteful computation and memory

Research Questions

  1. What is the theoretical minimum dimensionality for a given task?
  2. Can we adaptively adjust dimensionality during computation?
  3. Are there compressive encodings that preserve VSA properties at lower dimensions?

Approaches

1. Dimensionality Theory

Johnson-Lindenstrauss Lemma tells us random projections preserve distances.

Can we derive similar guarantees for VSA operations?

Conjecture: For N symbols with Ξ΅ error tolerance, minimum dimension:

d_min β‰₯ O(log(N) / Ρ²)

Open problems: - Exact bounds for binding/bundling capacity - Trade-offs between dimensionality, accuracy, and computational cost

2. Dynamic Dimensionality

class AdaptiveDimensionalityVSA:
    def __init__(self, dim_min=512, dim_max=10000):
        self.current_dim = dim_min

    def bind(self, a, b):
        # Monitor reconstruction error
        result = fhrr_bind(a, b, dim=self.current_dim)
        error = self.estimate_error(result)

        if error > threshold:
            # Increase dimensionality
            self.current_dim = min(self.current_dim * 2, self.dim_max)

        return result

3. Compression Techniques

Apply dimensionality reduction while preserving structure:

# Full VSA at 10,000 dims
hv_full = encode_full(data)

# Compress to 512 dims using learned projection
hv_compressed = learned_projection(hv_full, target_dim=512)

# Can we still unbind/bundle?

Your Research Opportunity

  • Project idea: Empirically measure capacity vs dimensionality for different VSA models
  • Questions: Is there a phase transition in performance? Can we predict required dimensionality from task complexity?

Research Direction 5: Continual Learning

The Problem

Traditional ML suffers from catastrophic forgetting: - Train on task A β†’ good performance - Train on task B β†’ task A performance collapses

VSA has natural advantages: - Bundling is order-invariant (can add facts any time) - High-dimensional space has room for many concepts - No destructive updates

Research Question

Can VSA enable perfect continual learning?

Approaches

1. Incremental Bundling

class ContinualMemory:
    def __init__(self, model):
        self.model = model
        self.knowledge = jnp.zeros(model.dim, dtype=jnp.complex64)

    def learn_fact(self, fact_hv):
        # Simply bundle new fact
        self.knowledge = self.model.opset.bundle(self.knowledge, fact_hv)
        # No forgetting!

    def recall(self, query_hv):
        return self.model.opset.bind(self.knowledge,
                                      self.model.opset.inverse(query_hv))

2. Hierarchical Consolidation

Organize memories in tree structure:

       Root (general knowledge)
      /    |    \
   Math  Science  History  ← Domains
    /\     /\      /\
   ...    ...     ...      ← Specific facts

3. Selective Forgetting

Unbundle old/irrelevant facts to make room:

# Remove outdated fact
outdated_fact_hv = memory["2020_covid_stats"]
knowledge = opset.unbundle(knowledge, outdated_fact_hv)

Open problems: - What's the capacity limit for bundled knowledge? - How to detect and correct interference? - Can we compress old knowledge (lossy but space-efficient)?

Your Research Opportunity

  • Project idea: Build continual learning benchmark for VSA
  • Questions: At what point does bundling degrade? Can we quantify forgetting rate?

Research Direction 6: Interpretability

The Problem

Deep learning is often a black box. VSA promises interpretability: - Unbind to extract components - Similarity to known concepts - Symbolic structure

But challenges remain: - High dimensions are hard to visualize - Approximate unbinding introduces noise - Resonator convergence may fail

Research Question

How can we make VSA reasoning fully transparent and interpretable?

Approaches

1. Visualization Tools

Dimensionality reduction for plotting:

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Encode many concepts
concepts = ["cat", "dog", "car", "tree", "mountain"]
hvs = [memory[c].vec for c in concepts]

# Reduce to 2D for plotting
hvs_2d = TSNE(n_components=2).fit_transform(hvs)

plt.scatter(hvs_2d[:, 0], hvs_2d[:, 1])
for i, concept in enumerate(concepts):
    plt.annotate(concept, (hvs_2d[i, 0], hvs_2d[i, 1]))
plt.show()

2. Explanation Generation

def explain_hypervector(hv, memory, top_k=5):
    """
    Explain what a hypervector represents by finding
    most similar known concepts.
    """
    similarities = []
    for name, stored_hv in memory.items():
        sim = cosine_similarity(hv, stored_hv.vec)
        similarities.append((name, sim))

    similarities.sort(key=lambda x: x[1], reverse=True)

    print(f"This hypervector is most similar to:")
    for name, sim in similarities[:top_k]:
        print(f"  - {name}: {sim:.3f}")

3. Causal Tracing

Track which components contribute to final answer:

# Query: "What is the capital of France?"
query_hv = encode_query("capital of France")

# Unbind to get answer
answer_hv = opset.bind(knowledge_base, opset.inverse(query_hv))

# Which facts in knowledge_base contributed most?
for fact_name, fact_hv in facts.items():
    contribution = cosine_similarity(answer_hv, fact_hv)
    print(f"Fact '{fact_name}' contribution: {contribution:.3f}")

Open problems: - Can we visualize high-dimensional binding structures? - How to present VSA reasoning to non-experts? - Quantifying uncertainty in VSA predictions?

Your Research Opportunity

  • Project idea: Build interactive VSA debugger/visualizer
  • Questions: Can users understand VSA reasoning? What visualizations are most helpful?

Open Problems Summary

Theoretical

  1. Capacity bounds: Exact theoretical limits for binding/bundling
  2. Optimal dimensionality: Principled way to choose dimension for a task
  3. Error propagation: How does noise accumulate in multi-hop reasoning?
  4. Universality: What computations can/cannot be expressed in VSA?

Algorithmic

  1. Learning: How to integrate gradient-based learning with VSA?
  2. Scaling: Efficient VSA for millions of symbols?
  3. Streaming data: Online learning with unbounded data?
  4. Multi-modality: Principled fusion of heterogeneous data?

Applied

  1. Benchmarks: Standard datasets for VSA evaluation?
  2. Real-world deployment: Production-ready VSA systems?
  3. Hybrid architectures: Best way to combine VSA + DNNs?
  4. Energy efficiency: How much better is VSA than DNNs on edge devices?

Contributing to VSAX

How to Get Involved

1. Report Issues

Found a bug or have a feature request? - GitHub Issues: https://github.com/anthropics/vsax/issues - Provide minimal reproducible example - Describe expected vs actual behavior

2. Contribute Code

  • Fork the repository
  • Create feature branch: git checkout -b feature/my-encoder
  • Write tests for your code
  • Submit pull request with clear description

Example contributions: - New encoder for your domain - Performance optimization - Additional examples/tutorials - Bug fixes

3. Share Your Research

  • Publish papers using VSAX
  • Share your custom encoders on GitHub
  • Write blog posts about applications
  • Present at conferences

4. Improve Documentation

  • Fix typos or unclear explanations
  • Add examples
  • Expand API documentation
  • Translate to other languages

Formulating Your Research Question

Template

Given [DOMAIN/PROBLEM],
can we [VSA TECHNIQUE/APPROACH]
to achieve [GOAL/METRIC]
better than [BASELINE]?

Examples

Example 1: Robotics

Given multi-modal sensor streams from a robot,
can we use SensorFusionEncoder + SSP
to achieve real-time localization and mapping
better than traditional SLAM algorithms?

Example 2: NLP

Given large-scale knowledge graphs,
can we use hierarchical VSA encoding + resonators
to achieve one-shot question answering
better than fine-tuned LLMs?

Example 3: Neuroscience

Given fMRI brain activity data,
can we decode mental states using VSA encoders
to achieve interpretable cognitive state classification
better than black-box DNNs?

Your Turn

What research question excites YOU?

Write it down using the template:

Given _______________________,
can we _______________________
to achieve __________________
better than _________________?

Resources for Further Learning

Key Papers

Foundational: 1. Plate (1995): "Holographic Reduced Representations" 2. Kanerva (2009): "Hyperdimensional Computing" 3. Gayler (2003): "Vector Symbolic Architectures"

Recent Surveys: 1. Kleyko et al. (2022): "Vector Symbolic Architectures as a Computing Framework" (IEEE Proceedings) 2. Schlegel et al. (2022): "A Comparison of Vector Symbolic Architectures"

Applications: 1. Imani et al. (2019): "A framework for collaborative learning in secure high-dimensional space" (Edge ML) 2. Ge & Parhi (2021): "Classification using hyperdimensional computing" (Efficient AI)

Conferences & Workshops

  • NeurIPS: Neuro-symbolic AI workshops
  • ICML: Efficient ML track
  • NICE: Neuro-Inspired Computational Elements workshop
  • CoRL: Conference on Robot Learning (VSA for control)

Online Communities

  • VSAX GitHub Discussions: Ask questions, share projects
  • Reddit r/MachineLearning: HDC/VSA threads
  • Twitter #VectorSymbolicArchitectures

Self-Assessment

Before concluding, ensure you can:

  • [ ] Identify at least 3 current research directions in VSA
  • [ ] Explain challenges in learning-based VSA
  • [ ] Describe opportunities for VSA + LLMs
  • [ ] Understand neuromorphic hardware potential
  • [ ] Recognize open theoretical and applied problems
  • [ ] Formulate your own research question
  • [ ] Know how to contribute to VSAX

Final Quiz

Question 1: What is the main advantage of learned basis vectors over random ones?

a) Faster computation b) Semantic structure in the representation space c) Lower memory usage d) Easier to implement

Answer **b) Semantic structure in the representation space** Learned basis vectors can capture inherent semantic relationships (e.g., "cat" and "dog" are similar) rather than being randomly orthogonal. This could improve generalization and reasoning while preserving VSA's compositional properties.

Question 2: Why is neuromorphic hardware promising for VSA?

a) VSA operations are naturally parallel and efficient in analog compute b) Neuromorphic chips are cheaper c) VSA doesn't work on traditional CPUs d) It's just a trend

Answer **a) VSA operations are naturally parallel and efficient in analog compute** Binding and bundling are embarrassingly parallel element-wise operations. In-memory analog computing can perform these operations extremely fast and energy-efficiently by avoiding data movement and leveraging physical properties (current summation, resistance multiplication).

Question 3: What is continual learning's main challenge that VSA might address?

a) Training speed b) Catastrophic forgetting c) Model size d) Data collection

Answer **b) Catastrophic forgetting** Traditional neural networks overwrite old knowledge when learning new tasks. VSA's bundling operation is additive and non-destructive - new facts can be added without erasing old ones, potentially enabling perfect continual learning.

Key Takeaways

βœ“ VSA research is vibrant and growing - opportunities across theory, algorithms, and applications βœ“ Learning-based VSA - integrating gradient learning with symbolic operations βœ“ VSA + LLMs - combining neural perception with symbolic reasoning βœ“ Neuromorphic hardware - orders of magnitude efficiency gains possible βœ“ Open problems abound - dimensionality, capacity, interpretability, continual learning βœ“ You can contribute! - VSAX welcomes encoders, optimizations, and applications


Course Complete!

Congratulations! You've completed the VSAX course covering:

  • βœ… Module 1: Foundational concepts (high dimensions, binding/bundling, three models)
  • βœ… Module 2: Core operations (FHRR, MAP, Binary, similarity, model selection)
  • βœ… Module 3: Encoders & applications (scalars, sequences, images, knowledge graphs, analogies)
  • βœ… Module 4: Advanced techniques (operators, SSP, hierarchical, multi-modal)
  • βœ… Module 5: Research & extensions (VFA, custom encoders, frontiers)

You are now equipped to: - Build VSA-powered applications - Design custom encoders for your domain - Contribute to VSA research - Push the boundaries of hyperdimensional computing


Where to Go From Here

Immediate Next Steps

  1. Build something! Apply VSAX to your research or project
  2. Share your work - Publish code, write blog posts, present at conferences
  3. Join the community - GitHub Discussions, contribute to VSAX
  4. Read papers - Dive deeper into topics that interest you

Advanced Topics (Beyond This Course)

  • Quantum VSA: Using quantum superposition for hypervectors
  • Biological VSA: Modeling neural circuits with VSA
  • VSA for Causality: Encoding causal relationships
  • Federated VSA: Distributed learning with privacy

Research Opportunities

Pick a research direction from this lesson and: 1. Read 3-5 key papers 2. Implement a proof-of-concept 3. Run experiments and analyze results 4. Write it up and share with the community!


Final Words

Vector Symbolic Architectures represent a paradigm shift in how we think about computation:

  • Not neural networks (though compatible with them)
  • Not classical symbolic AI (though shares symbolic properties)
  • A unique fusion of continuous + discrete, distributed + compositional, learned + structured

The field is young, the problems are hard, and the opportunities are immense.

Welcome to the VSA research community!

We can't wait to see what you build.


Acknowledgments

This course was built on decades of research by pioneers: - Tony Plate (Holographic Reduced Representations) - Pentti Kanerva (Hyperdimensional Computing) - Ross Gayler (Vector Symbolic Architectures) - Many others who advanced the field

Thank you for learning with us.

Now go forth and compute in high dimensions!

References

Key Surveys: - Kleyko, D., et al. (2022). "Vector Symbolic Architectures as a Computing Framework for Nanoscale Hardware." Proceedings of the IEEE, 109(8), 1366-1397. - Schlegel, K., Neubert, P., & Protzel, P. (2022). "A Comparison of Vector Symbolic Architectures." Artificial Intelligence Review, 55, 4523-4555.

Foundational Works: - Plate, T. A. (1995). "Holographic Reduced Representations." IEEE Transactions on Neural Networks, 6(3), 623-641. - Kanerva, P. (2009). "Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors." Cognitive Computation, 1, 139-159.

Recent Applications: - Imani, M., et al. (2019). "A Framework for Collaborative Learning in Secure High-Dimensional Space." - Neubert, P., Schubert, S., & Protzel, P. (2019). "Learning Vector Symbolic Architectures for Reactive Robot Behaviours." - Mitrokhin, A., et al. (2023). "Learning sensorimotor representations with spiking HDC."

Neuromorphic Hardware: - Karunaratne, G., et al. (2020). "In-memory hyperdimensional computing." Nature Electronics, 3, 327-337. - Poduval, P., et al. (2021). "HDnn: Hyperdimensional Inference with Spiking Neural Networks."