Protols is a Language Server Protocol (LSP) implementation for Protocol Buffers that I wrote in Rust. In this post I will talk about why I built it and some of the interesting bits from the implementation.
Why
At work, we use a lot of protobuf files. While protobuf is great for defining APIs and data structures, navigating between dozens (sometimes hundreds) of .proto files was painful. I was constantly running grep or using Vim’s search to jump between message definitions, enum declarations, and imports scattered across packages.
I looked for a decent Language Server for protobuf. At the time, the options were either incomplete, unmaintained, or didn’t support things like go-to-definition across package boundaries. So I decided to write one.
Getting started
I had never built a Language Server before, but I had worked with LSP from the client side and had a good understanding of the spec from my time working on cpeditor. I also had enough Rust experience to be comfortable with it, so that was the obvious language choice.
The idea is simple enough: parse protobuf files, understand their structure, and provide code intelligence. The devil is in the details.
Tree-sitter
I went with Tree-sitter for parsing. It gives you incremental parsing with solid error recovery, which is exactly what you need when users are actively typing incomplete code. For the LSP framework itself, I used async-lsp.
Tree-sitter has a query system, but I wasn’t aware of it when I started, so I ended up implementing all the tree walking by hand with recursive traversals:
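/// Collects every node in the tree for which `filter` returns true.
pub fn find_all_nodes(&self, filter: fn(&Node) -> bool) -> Vec<Node<'_>> {
    let mut result = Vec::new();
    self.visit_nodes(self.tree.root_node(), &filter, &mut result);
    result
}

/// Depth-first, pre-order walk: test the node itself, then recurse into its children.
/// The explicit lifetime ties the visited nodes to the nodes stored in `result`.
fn visit_nodes<'a>(&self, node: Node<'a>, filter: &fn(&Node) -> bool, result: &mut Vec<Node<'a>>) {
    if filter(&node) {
        result.push(node);
    }
    for child in node.children(&mut node.walk()) {
        self.visit_nodes(child, filter, result);
    }
}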
All those tree algorithms from CS courses finally found a practical use. Symbol collection, scope resolution, reference finding – all built on top of recursive AST traversals.
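To make the symbol-collection idea concrete, here is a minimal sketch of the same recursive pattern on a toy tree. It is not the code from Protols: `Decl`, `collect_symbols`, and `visit_decl` are hypothetical names, and the enum stands in for real Tree-sitter nodes. The scope is threaded through the recursion so nested declarations get fully qualified names.

```rust
/// Toy stand-in for a parsed .proto syntax tree (hypothetical type;
/// real Tree-sitter nodes carry node kinds and byte ranges instead).
#[derive(Debug)]
pub enum Decl {
    Message { name: String, children: Vec<Decl> },
    Enum { name: String },
    Field { name: String, type_name: String },
}

/// Recursively collects the fully qualified names of all messages and
/// enums declared under `package`.
pub fn collect_symbols(package: &str, decls: &[Decl]) -> Vec<String> {
    let mut out = Vec::new();
    for decl in decls {
        visit_decl(package, decl, &mut out);
    }
    out
}

fn visit_decl(scope: &str, decl: &Decl, out: &mut Vec<String>) {
    match decl {
        Decl::Message { name, children } => {
            // A message opens a new scope for everything nested inside it.
            let qualified = format!("{scope}.{name}");
            out.push(qualified.clone());
            for child in children {
                visit_decl(&qualified, child, out);
            }
        }
        Decl::Enum { name } => out.push(format!("{scope}.{name}")),
        // Fields use types but do not declare new ones.
        Decl::Field { .. } => {}
    }
}
```

Calling `collect_symbols("library", &tree)` on a `Book` message that contains a `Genre` enum yields `library.Book` followed by `library.Book.Genre`.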
Multi-file state
The most interesting challenge was managing state across files. Protobuf files import from each other, creating dependency graphs. The LSP needs to parse all files in a workspace, resolve imports, build a complete symbol table, keep everything in sync when files change, and provide diagnostics across file boundaries.
pub struct ProtoLanguageState {
    documents: Arc<RwLock<HashMap<Url, String>>>,
    trees: Arc<RwLock<HashMap<Url, ParsedTree>>>,
    parser: Arc<Mutex<ProtoParser>>,
    parsed_workspaces: Arc<RwLock<HashSet<String>>>,
    protoc_diagnostics: Arc<Mutex<ProtocDiagnostics>>,
}
Import resolution was the tricky part. When you parse a file, you need to recursively parse all its dependencies. You need to handle circular dependencies without looping forever, and avoid re-parsing files that haven’t changed.
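The cycle-handling part can be sketched with a visited set. This is a simplified illustration under stated assumptions, not the Protols implementation: `ImportGraph`, `resolve_imports`, and `visit_file` are hypothetical names, and the import lists are given up front instead of being extracted from parsed files. It does not cover skipping unchanged files.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory workspace: file path -> paths it imports.
/// A real server would fill this in from parsed `import` statements.
pub type ImportGraph = HashMap<String, Vec<String>>;

/// Returns every file reachable from `root` (including `root`) in
/// depth-first order, visiting each file exactly once.
pub fn resolve_imports(graph: &ImportGraph, root: &str) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut order = Vec::new();
    visit_file(graph, root, &mut seen, &mut order);
    order
}

fn visit_file(
    graph: &ImportGraph,
    file: &str,
    seen: &mut HashSet<String>,
    order: &mut Vec<String>,
) {
    // `insert` returns false if the file was already visited. That one
    // check breaks import cycles and avoids revisiting shared imports.
    if !seen.insert(file.to_string()) {
        return;
    }
    order.push(file.to_string());
    // Files with no entry in the graph are treated as having no imports.
    if let Some(imports) = graph.get(file) {
        for import in imports {
            visit_file(graph, import, seen, order);
        }
    }
}
```

With `a.proto` importing `b.proto` and `c.proto`, `b.proto` importing `c.proto`, and `c.proto` importing `a.proto` again, the traversal still terminates and each file is visited once.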
Features
Protols supports:
- Auto-completion: Messages, enums, and keywords within the current package
- Diagnostics: Tree-sitter syntax errors combined with protoc validation
- Go to Definition: Across package boundaries and imports
- Hover: Documentation and type information
- Document / Workspace Symbols: Navigable outline of file and workspace structure
- Find References: All usages of types and fields
- Rename: Across the entire codebase
- Formatting: Via clang-format
Rename was probably the most involved – it needs to find every reference to a symbol across potentially dozens of files and update them atomically.
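The bookkeeping behind an all-or-nothing rename can be sketched as follows. This is an illustration only, not how Protols does it: in LSP the server normally returns a `WorkspaceEdit` and the editor applies it, whereas this sketch rewrites in-memory strings directly. `Reference` and `rename` are hypothetical names, and references are assumed to be non-overlapping byte ranges.

```rust
use std::collections::HashMap;

/// Hypothetical reference location: a byte range inside one file.
#[derive(Debug, Clone)]
pub struct Reference {
    pub file: String,
    pub start: usize,
    pub end: usize,
}

/// Computes the new contents of every affected file. Returns `None`,
/// leaving `files` untouched, if any reference points at an unknown
/// file or an invalid range, so the rename is all-or-nothing.
pub fn rename(
    files: &HashMap<String, String>,
    refs: &[Reference],
    new_name: &str,
) -> Option<HashMap<String, String>> {
    // Group references by the file they belong to.
    let mut by_file: HashMap<&str, Vec<&Reference>> = HashMap::new();
    for r in refs {
        by_file.entry(r.file.as_str()).or_default().push(r);
    }

    let mut updated = HashMap::new();
    for (file, mut file_refs) in by_file {
        let mut text = files.get(file)?.clone();
        // Apply edits from the end of the file backwards so that the
        // offsets of earlier references stay valid.
        file_refs.sort_by(|a, b| b.start.cmp(&a.start));
        for r in file_refs {
            let valid = r.start <= r.end
                && text.is_char_boundary(r.start)
                && text.is_char_boundary(r.end);
            if !valid {
                return None;
            }
            text.replace_range(r.start..r.end, new_name);
        }
        updated.insert(file.to_string(), text);
    }
    Some(updated)
}
```

Because the function only builds a new map, the caller can swap in the updated contents in one step once every file has been processed successfully.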
Current state
Protols does what I need it to do day-to-day, so I’m not actively adding major features. It has over 100 stars on GitHub, is published on crates.io, has CI/CD, and even got a VS Code extension through community contributions.
The protobuf LSP ecosystem has also grown since I started. There are now several other implementations available, which is good to see.
If you work with protobuf files, you can install it with:
cargo install protols
Neovim setup:
require'lspconfig'.protols.setup{}