Lesson 5.3: Research Frontiers & Open Problems
Estimated time: 45 minutes
Learning Objectives
By the end of this lesson, you will be able to:
- Identify current research directions in Vector Symbolic Architectures
- Understand open problems and challenges in the field
- Explore connections between VSA and modern deep learning
- Recognize opportunities for neuromorphic hardware implementations
- Find ways to contribute to VSAX and the VSA research community
- Formulate your own research questions in VSA
Prerequisites
- Completion of Modules 1-4 (foundational VSA knowledge)
- Lessons 5.1-5.2 (VFA and custom encoders)
- Interest in research and pushing boundaries
The State of VSA Research (2025)
Vector Symbolic Architectures have evolved from a niche area to a vibrant research field with applications across:
- Cognitive Science: Modeling human memory and reasoning
- Robotics: Efficient control and perception
- Neuroscience: Understanding neural computation
- Edge Computing: Lightweight AI for resource-constrained devices
- Neuromorphic Hardware: Brain-inspired computing architectures
Recent momentum:
- Papers at major conferences (NeurIPS, ICML, ICLR, CVPR)
- Industrial adoption (Intel, IBM, startups)
- Cross-disciplinary collaborations (neuroscience + ML + hardware)
Research Direction 1: Learning-Based VSA
The Problem
Traditional VSA uses random basis vectors:
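As a minimal sketch of what "random basis vectors" means in practice, the snippet below draws independent bipolar hypervectors with NumPy and checks their similarity. The dimensionality, the symbol names, and the helpers `random_bipolar` and `cosine_similarity` are illustrative assumptions, not part of the VSAX API.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 10_000  # illustrative dimensionality

def random_bipolar(dim, rng):
    """Draw a random bipolar (+1/-1) basis vector."""
    return rng.choice([-1.0, 1.0], size=dim)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Each symbol gets an independent random vector; no semantics are involved.
basis = {name: random_bipolar(dim, rng) for name in ["cat", "dog", "car"]}

sim_cat_dog = cosine_similarity(basis["cat"], basis["dog"])
sim_cat_car = cosine_similarity(basis["cat"], basis["car"])
print(f"cat vs dog: {sim_cat_dog:+.3f}")
print(f"cat vs car: {sim_cat_car:+.3f}")
```

Both similarities come out close to zero: "cat" is no closer to "dog" than to "car", because independent random vectors in high dimensions are nearly orthogonal.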
Limitations:
- No semantic structure (random vectors don't capture meaning)
- Cannot leverage pre-trained knowledge
- Similarity depends on encoding, not inherent meaning
Research Question
Can we learn basis vectors that capture semantic structure?
Approaches
1. Pre-training from Data
Train basis vectors to preserve semantic relationships:
# Hypothetical learned basis
memory.add_learned("cat", embedding_from_bert("cat"))
memory.add_learned("dog", embedding_from_bert("dog"))
# Now cat and dog have inherent similarity!
sim = cosine_similarity(memory["cat"].vec, memory["dog"].vec) # High!
Current work:
- MikoliΔ et al. (2024): "Learning Hyperdimensional Representations from Data"
- VSA + Word2Vec/BERT hybrid models
2. Meta-Learning VSA Operations
Learn optimal binding/bundling strategies for specific domains:
# Learn domain-specific binding operator
learned_bind = LearnedBindingOperator(domain="vision")
scene = learned_bind(object1, object2) # Better than random binding?
Open problems:
- How to integrate gradient-based learning with discrete VSA ops?
- Can we learn in high dimensions (10,000+) efficiently?
- What inductive biases preserve VSA properties?
Your Research Opportunity
- Project idea: Train VSA basis vectors using contrastive learning
- Questions: Do learned bases retain VSA's symbolic properties? How much data is needed?
Research Direction 2: VSA + Large Language Models
The Problem
LLMs are powerful but:
- Opaque reasoning (black box)
- Cannot perform symbolic operations (binding, unbinding)
- Huge computational cost
VSA is efficient but:
- Requires manual feature engineering
- Cannot handle raw text well
Research Question
How can we combine LLM representations with VSA symbolic reasoning?
Approaches
1. LLM Embeddings → VSA Encoding
import torch
from transformers import AutoModel, AutoTokenizer
# Get LLM embedding
model_llm = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "The cat sat on the mat"
tokens = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    hidden = model_llm(**tokens).last_hidden_state  # shape (1, seq_len, 768)
embedding = hidden.mean(dim=1)  # mean pooling over tokens (not [CLS] pooling)
# Project to VSA space (project_to_vsa is a hypothetical helper)
vsa_hv = project_to_vsa(embedding, dim=2048)
# Now can perform symbolic operations
memory.add("sentence")
memory["sentence"] = model.rep_cls(vsa_hv)
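The `project_to_vsa` call above is not defined in this lesson. One plausible way to realize it, shown here as a sketch under stated assumptions, is a fixed random Gaussian projection followed by a sign function, which yields bipolar hypervectors whose similarity tracks the angle between the original embeddings. The embeddings below are synthetic stand-ins rather than real BERT outputs, and `make_projection` is a hypothetical helper.

```python
import numpy as np

def make_projection(in_dim, out_dim, seed=0):
    """Fixed random Gaussian projection matrix (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((out_dim, in_dim))

def project_to_vsa(embedding, projection):
    """Map a dense embedding to a bipolar hypervector via sign(P @ x)."""
    hv = np.sign(projection @ embedding)
    hv[hv == 0] = 1.0  # break ties so every component is +1 or -1
    return hv

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
projection = make_projection(in_dim=768, out_dim=2048)

# Synthetic stand-ins for sentence embeddings (not real BERT outputs).
emb_a = rng.standard_normal(768)
emb_b = emb_a + 0.3 * rng.standard_normal(768)  # a near-paraphrase of emb_a
emb_c = rng.standard_normal(768)                # an unrelated sentence

hv_a, hv_b, hv_c = (project_to_vsa(e, projection) for e in (emb_a, emb_b, emb_c))
sim_close = cosine_similarity(hv_a, hv_b)
sim_far = cosine_similarity(hv_a, hv_c)
print(f"related: {sim_close:.3f}, unrelated: {sim_far:.3f}")
```

The related pair stays clearly more similar than the unrelated pair after projection. Whether the projected vectors still bind and unbind cleanly is a separate question that this sketch does not answer.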
2. VSA-Augmented LLM Reasoning
Use VSA as external symbolic memory for LLMs:
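A minimal sketch of such a memory, assuming bipolar hypervectors with element-wise multiplication as binding and summation as bundling. The fact names and the `lookup` helper are illustrative assumptions; in a real system an LLM would produce the query key and consume the retrieved symbol.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4096

def random_hv():
    return rng.choice([-1.0, 1.0], size=dim)

# Codebook of atomic symbols (random bipolar hypervectors).
symbols = ["capital_of_France", "capital_of_Japan", "capital_of_Kenya",
           "Paris", "Tokyo", "Nairobi"]
codebook = {s: random_hv() for s in symbols}

facts = [("capital_of_France", "Paris"),
         ("capital_of_Japan", "Tokyo"),
         ("capital_of_Kenya", "Nairobi")]

# Store: bind each key to its value (element-wise product), then bundle (sum).
memory_hv = np.zeros(dim)
for key, value in facts:
    memory_hv += codebook[key] * codebook[value]

def lookup(key):
    """Unbind with the key, then clean up against the codebook."""
    noisy_value = memory_hv * codebook[key]  # bipolar keys are self-inverse
    scores = {s: float(np.dot(noisy_value, hv)) / dim for s, hv in codebook.items()}
    return max(scores, key=scores.get)

print(lookup("capital_of_Japan"))
```

All three facts live in one vector of fixed size, and each lookup is a single element-wise product followed by a nearest-neighbor search.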
Current work:
- Nunes et al. (2024): "Neuro-Symbolic Question Answering with HDC"
- VSA for factual knowledge grounding in LLMs
3. Hybrid Architecture
Input → LLM encoder → VSA layer → Reasoning → VSA decoder → LLM decoder → Output
        (perception)  (symbolic)  (inference) (symbolic)    (generation)
Open problems:
- How to backpropagate through VSA operations?
- Can VSA help with LLM hallucinations (via symbolic grounding)?
- What's the optimal dimensionality for LLM-VSA hybrids?
Your Research Opportunity
- Project idea: Build a VSA-augmented chatbot that can reason symbolically
- Questions: Does VSA improve factual accuracy? Can it explain reasoning steps?
Research Direction 3: Neuromorphic Hardware
The Problem
Traditional von Neumann architectures are inefficient for VSA:
- High-dimensional vectors (10,000 dims) → memory bottleneck
- Binding/bundling are embarrassingly parallel → that parallelism goes underutilized on sequential processors
Research Question
Can we build specialized hardware for VSA that's orders of magnitude more efficient?
Approaches
1. In-Memory Computing
Store hypervectors directly in analog memory:
Hypervector [0.3, -0.7, 0.5, ...] → Stored as resistances in ReRAM
Bundling → Parallel current summation (analog addition)
Binding → Element-wise multiply in crossbar array
Advantages:
- Extremely fast (nanosecond operations)
- Low power (no data movement)
- Compact (analog storage)
Current work:
- Karunaratne et al. (2020): "In-memory hyperdimensional computing"
- IBM neuromorphic chips with HDC
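Analog storage is imprecise, so a natural question is how much device noise a hypervector lookup can tolerate. The sketch below is a software simulation only: it models an imprecise analog read as additive Gaussian noise (an assumption, not a device model) and checks whether nearest-neighbor lookup still finds the right item.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_items = 10_000, 50

# Item memory: 50 random bipolar hypervectors.
item_memory = rng.choice([-1.0, 1.0], size=(n_items, dim))

def read_with_noise(hv, noise_std):
    """Model an imprecise analog read as additive Gaussian noise (an assumption)."""
    return hv + noise_std * rng.standard_normal(hv.shape)

def nearest_item(query):
    """Associative lookup: index of the stored vector with the largest dot product."""
    return int(np.argmax(item_memory @ query))

target = 7
for noise_std in (0.5, 1.0, 2.0):
    noisy = read_with_noise(item_memory[target], noise_std)
    print(f"noise_std={noise_std}: recovered item {nearest_item(noisy)}")

recovered = [nearest_item(read_with_noise(item_memory[i], 2.0)) for i in range(n_items)]
accuracy = float(np.mean(np.array(recovered) == np.arange(n_items)))
print(f"accuracy at noise_std=2.0: {accuracy:.2f}")
```

Even when the noise has twice the magnitude of each component, the signal adds up coherently over 10,000 dimensions while the noise averages out. This tolerance is what makes low-precision analog substrates a plausible fit for VSA.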
2. Spiking Neural Networks (SNNs) + VSA
Encode hypervectors as spike patterns:
Hypervector component 0.7 → High firing rate neuron
Hypervector component -0.3 → Low firing rate neuron
Advantages:
- Biologically plausible
- Event-driven (sparse activity)
- Ultra-low power
Current work:
- Mitrokhin et al. (2023): "Learning sensorimotor representations with spiking HDC"
- Intel Loihi neuromorphic chip
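The rate-coding idea above can be simulated without a spiking simulator. The sketch below assumes a linear map from component values in [-1, 1] to firing rates, so negative components become low but non-negative rates, and assumes Poisson spike counts. The peak rate and the observation window are illustrative values, not measurements from any chip.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 2_000
max_rate_hz = 100.0   # assumed peak firing rate
window_s = 0.5        # assumed observation window

def encode_rates(hv):
    """Map components in [-1, 1] linearly onto firing rates in [0, max_rate_hz]."""
    return max_rate_hz * (hv + 1.0) / 2.0

def sample_spike_counts(rates):
    """Poisson spike counts for each neuron over the observation window."""
    return rng.poisson(rates * window_s)

def decode_counts(counts):
    """Invert the rate code: estimated component values from spike counts."""
    return 2.0 * counts / (max_rate_hz * window_s) - 1.0

hv = rng.uniform(-1.0, 1.0, size=dim)
counts = sample_spike_counts(encode_rates(hv))
hv_decoded = decode_counts(counts)

similarity = float(np.dot(hv, hv_decoded) /
                   (np.linalg.norm(hv) * np.linalg.norm(hv_decoded)))
print(f"cosine similarity after rate coding: {similarity:.3f}")
```

The decoded vector stays close to the original despite spike-count noise. Lengthening the window or raising the peak rate improves fidelity at the cost of latency or energy, which is one of the trade-offs a spiking implementation has to navigate.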
3. Photonic Computing
Use light for hypervector operations:
Hypervector → Optical signal (wavelength/phase encoded)
Binding → Optical interference
Bundling → Beam combining
Advantages:
- Speed of light computation
- Minimal heat generation
- Massively parallel
Open problems:
- How to handle negative values in analog hardware?
- Can we train VSA models directly on neuromorphic chips?
- What's the optimal hypervector representation for SNNs?
Your Research Opportunity
- Project idea: Simulate VSA on spiking neural network simulator (e.g., Brian2, NEST)
- Questions: What's the energy efficiency gain? Can SNNs learn VSA operations?
Research Direction 4: Dimensionality & Capacity
The Problem
Current VSA systems use a fixed dimensionality (typically 512-10,000):
- Too low → poor capacity, high noise
- Too high → wasteful computation and memory
Research Questions
- What is the theoretical minimum dimensionality for a given task?
- Can we adaptively adjust dimensionality during computation?
- Are there compressive encodings that preserve VSA properties at lower dimensions?
Approaches
1. Dimensionality Theory
Johnson-Lindenstrauss Lemma tells us random projections preserve distances.
Can we derive similar guarantees for VSA operations?
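The Johnson-Lindenstrauss effect is easy to check numerically. The sketch below projects random points through a scaled Gaussian matrix and measures how much pairwise Euclidean distances change. The point count and dimensions are illustrative choices, and the check covers distances only, not binding or bundling.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, high_dim, low_dim = 50, 2_000, 500

points = rng.standard_normal((n_points, high_dim))

# Gaussian random projection, scaled so squared norms are preserved in expectation.
projection = rng.standard_normal((high_dim, low_dim)) / np.sqrt(low_dim)
projected = points @ projection

def pairwise_distances(x):
    """Euclidean distances between all distinct pairs of rows."""
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    upper = np.triu_indices(len(x), k=1)
    return np.sqrt(np.maximum(sq_dists[upper], 0.0))

ratios = pairwise_distances(projected) / pairwise_distances(points)
max_distortion = float(np.max(np.abs(ratios - 1.0)))
print(f"max relative distortion: {max_distortion:.3f}")
```

Every pairwise distance survives a fourfold reduction in dimension with only modest distortion. The open question in this section is whether comparable guarantees hold after several rounds of binding and bundling.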
Conjecture: For N symbols with error tolerance ε, the minimum dimension scales like the Johnson-Lindenstrauss bound:
D_min = O(log N / ε²)
Open problems:
- Exact bounds for binding/bundling capacity
- Trade-offs between dimensionality, accuracy, and computational cost
2. Dynamic Dimensionality
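class AdaptiveDimensionalityVSA:
    def __init__(self, dim_min=512, dim_max=10000, threshold=0.1):
        self.current_dim = dim_min
        self.dim_max = dim_max
        self.threshold = threshold  # maximum tolerated reconstruction error

    def bind(self, a, b):
        # Monitor reconstruction error (fhrr_bind and estimate_error are hypothetical)
        result = fhrr_bind(a, b, dim=self.current_dim)
        error = self.estimate_error(result)
        if error > self.threshold:
            # Double the dimensionality for subsequent operations, up to dim_max
            self.current_dim = min(self.current_dim * 2, self.dim_max)
        return result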
3. Compression Techniques
Apply dimensionality reduction while preserving structure:
# Full VSA at 10,000 dims
hv_full = encode_full(data)
# Compress to 512 dims using learned projection
hv_compressed = learned_projection(hv_full, target_dim=512)
# Can we still unbind/bundle?
Your Research Opportunity
- Project idea: Empirically measure capacity vs dimensionality for different VSA models
- Questions: Is there a phase transition in performance? Can we predict required dimensionality from task complexity?
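A starting point for that project could look like the sketch below. It stores key-value pairs in a single bundled bipolar hypervector and measures how many values are recovered at each dimensionality. The choice of bipolar vectors, the number of pairs, and the `recall_accuracy` helper are assumptions made for illustration; a full study would repeat this across VSA models and trial counts.

```python
import numpy as np

def recall_accuracy(dim, n_pairs, n_trials=5, seed=0):
    """Fraction of key-value pairs recovered from one bundled hypervector."""
    rng = np.random.default_rng(seed)
    correct = 0
    for _ in range(n_trials):
        keys = rng.choice([-1.0, 1.0], size=(n_pairs, dim))
        values = rng.choice([-1.0, 1.0], size=(n_pairs, dim))
        bundle = np.sum(keys * values, axis=0)             # bind, then bundle
        unbound = keys * bundle                            # unbind every key at once
        predicted = np.argmax(unbound @ values.T, axis=1)  # clean-up memory
        correct += int(np.sum(predicted == np.arange(n_pairs)))
    return correct / (n_trials * n_pairs)

n_pairs = 100
results = {dim: recall_accuracy(dim, n_pairs) for dim in (64, 256, 1024, 4096)}
for dim, acc in results.items():
    print(f"dim={dim:5d}  pairs={n_pairs}  accuracy={acc:.2f}")
```

Sweeping `n_pairs` as well as `dim` turns this into a two-dimensional capacity map, which is where a sharp transition in accuracy, if one exists, would show up.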
Research Direction 5: Continual Learning
The Problem
Traditional ML suffers from catastrophic forgetting:
- Train on task A → good performance
- Train on task B → task A performance collapses
VSA has natural advantages:
- Bundling is order-invariant (can add facts any time)
- High-dimensional space has room for many concepts
- No destructive updates
Research Question
Can VSA enable perfect continual learning?
Approaches
1. Incremental Bundling
import jax.numpy as jnp

class ContinualMemory:
    def __init__(self, model):
        self.model = model
        self.knowledge = jnp.zeros(model.dim, dtype=jnp.complex64)

    def learn_fact(self, fact_hv):
        # Simply bundle the new fact; earlier facts are never overwritten,
        # although crosstalk noise grows with every fact added
        self.knowledge = self.model.opset.bundle(self.knowledge, fact_hv)

    def recall(self, query_hv):
        # Unbind: bind the knowledge vector with the inverse of the query
        return self.model.opset.bind(self.knowledge,
                                     self.model.opset.inverse(query_hv))
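To see how incremental bundling behaves as facts accumulate, here is a self-contained NumPy sketch that does not use the VSAX API. It assumes FHRR-style unit phasors, with complex multiplication as binding, the complex conjugate as the inverse, and plain summation as bundling, and it tracks how well the first fact can still be recalled after each new one is learned.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 2048

def random_phasor():
    """Random FHRR-style hypervector: unit-magnitude complex phasors."""
    return np.exp(1j * rng.uniform(-np.pi, np.pi, size=dim))

def similarity(a, b):
    """Real part of the normalized Hermitian inner product."""
    return float(np.real(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b)))

keys = [random_phasor() for _ in range(20)]
values = [random_phasor() for _ in range(20)]

knowledge = np.zeros(dim, dtype=np.complex128)
first_fact_similarity = []
for key, value in zip(keys, values):
    knowledge = knowledge + key * value        # learn: bind, then bundle
    recalled = knowledge * np.conj(keys[0])    # recall: unbind with the inverse key
    first_fact_similarity.append(similarity(recalled, values[0]))

print(f"after  1 fact : {first_fact_similarity[0]:.3f}")
print(f"after 20 facts: {first_fact_similarity[-1]:.3f}")
```

The first fact is never overwritten, but its recall similarity falls roughly as 1/sqrt(N) as N facts share the same vector. Recall then depends on a clean-up memory to pick the right value out of the crosstalk, so bundling avoids catastrophic forgetting only up to a capacity limit.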
2. Hierarchical Consolidation
Organize memories in tree structure:
        Root (general knowledge)
       /        |        \
   Math      Science    History    ← Domains
    /\         /\         /\
   ...        ...        ...       ← Specific facts
3. Selective Forgetting
Unbundle old/irrelevant facts to make room:
# Remove outdated fact
outdated_fact_hv = memory["2020_covid_stats"]
knowledge = opset.unbundle(knowledge, outdated_fact_hv)
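The `opset.unbundle` call above is hypothetical. If the bundle is kept as an unnormalized sum, removing a fact amounts to subtracting it, as this NumPy sketch illustrates. The fact names and the bipolar representation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4096

facts = {name: rng.choice([-1.0, 1.0], size=dim)
         for name in ("fact_a", "fact_b", "outdated_fact")}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Keep the bundle as an unnormalized sum so that removal is exact.
knowledge = sum(facts.values())

before = cosine_similarity(knowledge, facts["outdated_fact"])
knowledge = knowledge - facts["outdated_fact"]   # "unbundle" by subtraction
after = cosine_similarity(knowledge, facts["outdated_fact"])
kept = cosine_similarity(knowledge, facts["fact_a"])

print(f"outdated fact before removal: {before:.3f}")
print(f"outdated fact after removal:  {after:.3f}")
print(f"retained fact after removal:  {kept:.3f}")
```

Exact removal works only while the raw sum is available. Once a bundle has been normalized or thresholded (for example with a sign function), subtraction becomes approximate and leaves residue behind.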
Open problems:
- What's the capacity limit for bundled knowledge?
- How to detect and correct interference?
- Can we compress old knowledge (lossy but space-efficient)?
Your Research Opportunity
- Project idea: Build continual learning benchmark for VSA
- Questions: At what point does bundling degrade? Can we quantify forgetting rate?
Research Direction 6: Interpretability
The Problem
Deep learning is often a black box. VSA promises interpretability:
- Unbind to extract components
- Similarity to known concepts
- Symbolic structure
But challenges remain:
- High dimensions are hard to visualize
- Approximate unbinding introduces noise
- Resonator convergence may fail
Research Question
How can we make VSA reasoning fully transparent and interpretable?
Approaches
1. Visualization Tools
Dimensionality reduction for plotting:
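import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
# Encode many concepts
concepts = ["cat", "dog", "car", "tree", "mountain"]
# Stack into a real-valued (n_concepts, dim) array; complex hypervectors (e.g. FHRR)
# would first need converting, e.g. by concatenating real and imaginary parts
hvs = np.array([memory[c].vec for c in concepts])
# Reduce to 2D for plotting (perplexity must be smaller than the number of samples)
hvs_2d = TSNE(n_components=2, perplexity=2).fit_transform(hvs)
plt.scatter(hvs_2d[:, 0], hvs_2d[:, 1])
for i, concept in enumerate(concepts):
    plt.annotate(concept, (hvs_2d[i, 0], hvs_2d[i, 1]))
plt.show()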
2. Explanation Generation
def explain_hypervector(hv, memory, top_k=5):
    """
    Explain what a hypervector represents by finding
    most similar known concepts.
    """
    similarities = []
    for name, stored_hv in memory.items():
        sim = cosine_similarity(hv, stored_hv.vec)
        similarities.append((name, sim))
    similarities.sort(key=lambda x: x[1], reverse=True)
    print("This hypervector is most similar to:")
    for name, sim in similarities[:top_k]:
        print(f" - {name}: {sim:.3f}")
3. Causal Tracing
Track which components contribute to final answer:
# Query: "What is the capital of France?"
query_hv = encode_query("capital of France")
# Unbind to get answer
answer_hv = opset.bind(knowledge_base, opset.inverse(query_hv))
# Which facts in knowledge_base contributed most?
for fact_name, fact_hv in facts.items():
    contribution = cosine_similarity(answer_hv, fact_hv)
    print(f"Fact '{fact_name}' contribution: {contribution:.3f}")
Open problems:
- Can we visualize high-dimensional binding structures?
- How to present VSA reasoning to non-experts?
- Quantifying uncertainty in VSA predictions?
Your Research Opportunity
- Project idea: Build interactive VSA debugger/visualizer
- Questions: Can users understand VSA reasoning? What visualizations are most helpful?
Open Problems Summary
Theoretical
- Capacity bounds: Exact theoretical limits for binding/bundling
- Optimal dimensionality: Principled way to choose dimension for a task
- Error propagation: How does noise accumulate in multi-hop reasoning?
- Universality: What computations can/cannot be expressed in VSA?
Algorithmic
- Learning: How to integrate gradient-based learning with VSA?
- Scaling: Efficient VSA for millions of symbols?
- Streaming data: Online learning with unbounded data?
- Multi-modality: Principled fusion of heterogeneous data?
Applied
- Benchmarks: Standard datasets for VSA evaluation?
- Real-world deployment: Production-ready VSA systems?
- Hybrid architectures: Best way to combine VSA + DNNs?
- Energy efficiency: How much better is VSA than DNNs on edge devices?
Contributing to VSAX
How to Get Involved
1. Report Issues
Found a bug or have a feature request?
- GitHub Issues: https://github.com/anthropics/vsax/issues
- Provide minimal reproducible example
- Describe expected vs actual behavior
2. Contribute Code
- Fork the repository
- Create feature branch: git checkout -b feature/my-encoder
- Write tests for your code
- Submit pull request with clear description
Example contributions:
- New encoder for your domain
- Performance optimization
- Additional examples/tutorials
- Bug fixes
3. Share Your Research
- Publish papers using VSAX
- Share your custom encoders on GitHub
- Write blog posts about applications
- Present at conferences
4. Improve Documentation
- Fix typos or unclear explanations
- Add examples
- Expand API documentation
- Translate to other languages
Formulating Your Research Question
Template
Given [DOMAIN/PROBLEM],
can we [VSA TECHNIQUE/APPROACH]
to achieve [GOAL/METRIC]
better than [BASELINE]?
Examples
Example 1: Robotics
Given multi-modal sensor streams from a robot,
can we use SensorFusionEncoder + SSP
to achieve real-time localization and mapping
better than traditional SLAM algorithms?
Example 2: NLP
Given large-scale knowledge graphs,
can we use hierarchical VSA encoding + resonators
to achieve one-shot question answering
better than fine-tuned LLMs?
Example 3: Neuroscience
Given fMRI brain activity data,
can we decode mental states using VSA encoders
to achieve interpretable cognitive state classification
better than black-box DNNs?
Your Turn
What research question excites YOU?
Write it down using the template:
Given _______________________,
can we _______________________
to achieve __________________
better than _________________?
Resources for Further Learning
Key Papers
Foundational:
1. Plate (1995): "Holographic Reduced Representations"
2. Kanerva (2009): "Hyperdimensional Computing"
3. Gayler (2003): "Vector Symbolic Architectures"
Recent Surveys:
1. Kleyko et al. (2022): "Vector Symbolic Architectures as a Computing Framework" (Proceedings of the IEEE)
2. Schlegel et al. (2022): "A Comparison of Vector Symbolic Architectures"
Applications:
1. Imani et al. (2019): "A framework for collaborative learning in secure high-dimensional space" (Edge ML)
2. Ge & Parhi (2021): "Classification using hyperdimensional computing" (Efficient AI)
Conferences & Workshops
- NeurIPS: Neuro-symbolic AI workshops
- ICML: Efficient ML track
- NICE: Neuro-Inspired Computational Elements workshop
- CoRL: Conference on Robot Learning (VSA for control)
Online Communities
- VSAX GitHub Discussions: Ask questions, share projects
- Reddit r/MachineLearning: HDC/VSA threads
- Twitter #VectorSymbolicArchitectures
Self-Assessment
Before concluding, ensure you can:
- [ ] Identify at least 3 current research directions in VSA
- [ ] Explain challenges in learning-based VSA
- [ ] Describe opportunities for VSA + LLMs
- [ ] Understand neuromorphic hardware potential
- [ ] Recognize open theoretical and applied problems
- [ ] Formulate your own research question
- [ ] Know how to contribute to VSAX
Final Quiz
Question 1: What is the main advantage of learned basis vectors over random ones?
a) Faster computation
b) Semantic structure in the representation space
c) Lower memory usage
d) Easier to implement
Answer
**b) Semantic structure in the representation space** Learned basis vectors can capture inherent semantic relationships (e.g., "cat" and "dog" are similar) rather than being nearly orthogonal by chance. This could improve generalization and reasoning, provided the learned vectors retain VSA's compositional properties, which is still an open question.
Question 2: Why is neuromorphic hardware promising for VSA?
a) VSA operations are naturally parallel and efficient in analog compute
b) Neuromorphic chips are cheaper
c) VSA doesn't work on traditional CPUs
d) It's just a trend
Answer
**a) VSA operations are naturally parallel and efficient in analog compute** Binding and bundling are embarrassingly parallel element-wise operations. In-memory analog computing can perform these operations extremely fast and energy-efficiently by avoiding data movement and leveraging physical properties (current summation for addition, Ohm's law for multiplication).
Question 3: What is continual learning's main challenge that VSA might address?
a) Training speed
b) Catastrophic forgetting
c) Model size
d) Data collection
Answer
**b) Catastrophic forgetting** Traditional neural networks overwrite old knowledge when learning new tasks. VSA's bundling operation is additive and non-destructive: new facts can be added without erasing old ones, potentially enabling continual learning without catastrophic forgetting, up to the capacity limit of the bundle.
Key Takeaways
- VSA research is vibrant and growing: opportunities across theory, algorithms, and applications
- Learning-based VSA: integrating gradient learning with symbolic operations
- VSA + LLMs: combining neural perception with symbolic reasoning
- Neuromorphic hardware: orders of magnitude efficiency gains possible
- Open problems abound: dimensionality, capacity, interpretability, continual learning
- You can contribute! VSAX welcomes encoders, optimizations, and applications
Course Complete!
Congratulations! You've completed the VSAX course covering:
- Module 1: Foundational concepts (high dimensions, binding/bundling, three models)
- Module 2: Core operations (FHRR, MAP, Binary, similarity, model selection)
- Module 3: Encoders & applications (scalars, sequences, images, knowledge graphs, analogies)
- Module 4: Advanced techniques (operators, SSP, hierarchical, multi-modal)
- Module 5: Research & extensions (VFA, custom encoders, frontiers)
You are now equipped to:
- Build VSA-powered applications
- Design custom encoders for your domain
- Contribute to VSA research
- Push the boundaries of hyperdimensional computing
Where to Go From Here
Immediate Next Steps
- Build something! Apply VSAX to your research or project
- Share your work - Publish code, write blog posts, present at conferences
- Join the community - GitHub Discussions, contribute to VSAX
- Read papers - Dive deeper into topics that interest you
Advanced Topics (Beyond This Course)
- Quantum VSA: Using quantum superposition for hypervectors
- Biological VSA: Modeling neural circuits with VSA
- VSA for Causality: Encoding causal relationships
- Federated VSA: Distributed learning with privacy
Research Opportunities
Pick a research direction from this lesson and:
1. Read 3-5 key papers
2. Implement a proof-of-concept
3. Run experiments and analyze results
4. Write it up and share with the community!
Final Words
Vector Symbolic Architectures represent a paradigm shift in how we think about computation:
- Not neural networks (though compatible with them)
- Not classical symbolic AI (though shares symbolic properties)
- A unique fusion of continuous + discrete, distributed + compositional, learned + structured
The field is young, the problems are hard, and the opportunities are immense.
Welcome to the VSA research community!
We can't wait to see what you build.
Acknowledgments
This course was built on decades of research by pioneers:
- Tony Plate (Holographic Reduced Representations)
- Pentti Kanerva (Hyperdimensional Computing)
- Ross Gayler (Vector Symbolic Architectures)
- Many others who advanced the field
Thank you for learning with us.
Now go forth and compute in high dimensions!
References
Key Surveys:
- Kleyko, D., et al. (2022). "Vector Symbolic Architectures as a Computing Framework for Emerging Hardware." Proceedings of the IEEE, 110(10), 1538-1571.
- Schlegel, K., Neubert, P., & Protzel, P. (2022). "A Comparison of Vector Symbolic Architectures." Artificial Intelligence Review, 55, 4523-4555.
Foundational Works:
- Plate, T. A. (1995). "Holographic Reduced Representations." IEEE Transactions on Neural Networks, 6(3), 623-641.
- Kanerva, P. (2009). "Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors." Cognitive Computation, 1, 139-159.
Recent Applications:
- Imani, M., et al. (2019). "A Framework for Collaborative Learning in Secure High-Dimensional Space."
- Neubert, P., Schubert, S., & Protzel, P. (2019). "Learning Vector Symbolic Architectures for Reactive Robot Behaviours."
- Mitrokhin, A., et al. (2023). "Learning sensorimotor representations with spiking HDC."
Neuromorphic Hardware:
- Karunaratne, G., et al. (2020). "In-memory hyperdimensional computing." Nature Electronics, 3, 327-337.
- Poduval, P., et al. (2021). "HDnn: Hyperdimensional Inference with Spiking Neural Networks."