how jottie's on-device ai actually works
february 2026
people keep asking me how jottie does semantic search without an internet connection. so here's the technical breakdown.
embeddings 101
first, a quick primer on how semantic search works. when you type a note, we need to turn the words into numbers. specifically, a big list of numbers called an embedding vector.
the magic is that similar concepts end up with similar numbers. so "dog" and "puppy" and "golden retriever" all cluster together in this number space. same with "meeting notes" and "call summary" and "discussion recap."
when you search for "pet," we turn your search into numbers and find notes with similar numbers. that's why you get results about dogs and cats even though you searched for "pet."
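to make that concrete, here's a toy version with made-up three-number vectors. real embedding vectors have hundreds of dimensions, but the math is identical:

```swift
// toy vectors, numbers invented purely for illustration
let dog: [Double]     = [0.9, 0.8, 0.1]
let puppy: [Double]   = [0.8, 0.9, 0.2]
let invoice: [Double] = [0.1, 0.2, 0.9]

// cosine similarity: dot product divided by the vectors' lengths.
// close to 1 means "pointing the same way", i.e. similar meaning.
func similarity(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map(*).reduce(0, +)
    let length = { (v: [Double]) in v.map { $0 * $0 }.reduce(0, +).squareRoot() }
    return dot / (length(a) * length(b))
}

print(similarity(dog, puppy))   // ~0.99, close together
print(similarity(dog, invoice)) // ~0.30, far apart
```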
nlembedding
apple ships a framework called natural language that includes nlembedding. this is the core of how jottie works on ios.
when you save a note, we pass it through nlembedding to get a vector. that vector gets stored locally in a sqlite database. when you search, we embed your query and do a cosine similarity search against all your stored vectors.
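in code, the save/search path looks something like this. it's a sketch, not jottie's actual source: an in-memory dictionary stands in for the sqlite table, and NoteIndex is a name i made up, but the NLEmbedding calls are the real framework api:

```swift
import NaturalLanguage

// sketch of the save/search path. an in-memory dictionary stands in
// for the sqlite table, and NoteIndex is a made-up name.
struct NoteIndex {
    private let embedding: NLEmbedding
    private var vectors: [String: [Double]] = [:]  // note id -> stored vector

    init?() {
        // the sentence-embedding model ships with the os; this is nil
        // only if it's unavailable for the language
        guard let model = NLEmbedding.sentenceEmbedding(for: .english) else { return nil }
        embedding = model
    }

    // on save: embed the note text and keep the vector locally
    mutating func add(noteID: String, text: String) {
        if let vector = embedding.vector(for: text) {
            vectors[noteID] = vector
        }
    }

    // on search: embed the query, rank every stored vector by cosine similarity
    func search(_ query: String, limit: Int = 5) -> [String] {
        guard let q = embedding.vector(for: query) else { return [] }
        return vectors
            .map { (id: $0.key, score: cosine(q, $0.value)) }
            .sorted { $0.score > $1.score }
            .prefix(limit)
            .map(\.id)
    }

    private func cosine(_ a: [Double], _ b: [Double]) -> Double {
        let dot = zip(a, b).map(*).reduce(0, +)
        let norm = { (v: [Double]) in v.map { $0 * $0 }.reduce(0, +).squareRoot() }
        return dot / (norm(a) * norm(b))
    }
}
```

yes, that search is a brute-force scan over every vector. for the few thousand notes a person actually writes, it's fast enough that you'd never notice.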
all of this happens on your device. the models are already on your phone, shipped as part of ios. no download required. no api call. no internet.
the model quality is surprisingly good. it's not as powerful as what you'd get from openai or anthropic, but for note-taking use cases it's more than enough. you're not asking it to write poetry, you're asking it to find related notes.
auto-tagging with natural language
beyond embeddings, the natural language framework can extract entities from text: people, places, organizations. dates get picked up by foundation's data detector, which runs on-device too.
so when you write "met with sarah about the q3 launch at blue bottle on monday," jottie can automatically tag it with sarah (matched to your contacts), q3 launch (project), blue bottle (location), and monday (resolved to the actual date).
this happens as you type. no cloud processing. no delay waiting for an api response.
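here's roughly what the extraction looks like. the names come from NLTagger, the date from NSDataDetector; matching "q3 launch" to a project is jottie's own logic on top, which i'm not showing here. (the input is capitalized in this sketch because the name model keys off capitalization.)

```swift
import NaturalLanguage

let text = "Met with Sarah about the Q3 launch at Blue Bottle on Monday"

// people, places, and organizations via the nameType scheme
let tagger = NLTagger(tagSchemes: [.nameType])
tagger.string = text

let nameTags: [NLTag] = [.personalName, .placeName, .organizationName]
tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .nameType,
                     options: [.omitWhitespace, .omitPunctuation, .joinNames]) { tag, range in
    if let tag, nameTags.contains(tag) {
        print("\(text[range]) -> \(tag.rawValue)")  // e.g. "Sarah -> PersonalName"
    }
    return true
}

// dates come from foundation's data detector, not NLTagger
let detector = try! NSDataDetector(types: NSTextCheckingResult.CheckingType.date.rawValue)
let nsRange = NSRange(text.startIndex..., in: text)
for match in detector.matches(in: text, options: [], range: nsRange) {
    if let date = match.date, let r = Range(match.range, in: text) {
        print("\(text[r]) -> \(date)")  // "on Monday" resolved relative to today
    }
}
```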
apple foundation models
with ios 26, apple is shipping foundation models that run on-device. these are small language models trained for specific tasks like summarization, categorization, and text generation.
jottie uses these for things like suggesting note categories, generating quick summaries, and powering some of the more complex search queries.
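i won't paste jottie's actual prompts, but the shape of the api is simple. here's a sketch using the foundation models framework; the instructions, category list, and function name are mine, not jottie's:

```swift
import FoundationModels

// sketch of category suggestion with the on-device model. the
// instructions, category list, and fallback are invented for the example.
func suggestCategory(for note: String) async throws -> String {
    // the model isn't always there (unsupported device, apple
    // intelligence turned off), so check availability first
    guard case .available = SystemLanguageModel.default.availability else {
        return "uncategorized"
    }
    let session = LanguageModelSession(
        instructions: "reply with one category for the note: work, personal, ideas, or errands."
    )
    let response = try await session.respond(to: note)
    return response.content
}
```

that availability check isn't optional in practice: the model is missing on older devices and when apple intelligence is disabled, so anything built on it needs a graceful fallback.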
again: all on-device. apple designed these models to be efficient enough to run on an iphone without destroying your battery.
the tradeoffs
i'm not going to pretend local inference is as good as cloud. it's not. there are tradeoffs.
- embedding quality is lower than gemini's or openai's cloud models
- context windows are smaller
- complex queries take longer
- you're limited to what apple ships
but here's the thing: for note-taking, these tradeoffs are worth it. you're not building a chatbot. you're organizing your own thoughts. the model doesn't need to know everything; it just needs to understand your notes well enough to help you find them later.
why this matters
local inference is the future. i genuinely believe that. the models are getting smaller and more efficient. phones are getting more powerful. the gap between cloud and on-device is shrinking.
more importantly, local inference means privacy by default. there's no data to leak because the data never leaves. there's no server to breach because there is no server.
your notes get the benefits of ai without the privacy tradeoffs. that's the whole point.
try it yourself
if you want to see this in action, download jottie from the app store. you don't need an account. you don't need to connect to anything. just start writing notes and search for them.
it feels like magic the first time you search for a concept and find notes you forgot you wrote. but really, it's just math running on your phone.