From the Mailbag: Embedding Content

By Jordan, 1 January 2026
[Image: a robot looking at a vector database]

I was pleased to find a letter recently come through the contact page on my website.  It reads:

Hi there, I was at Badcamp and am working through the tutorial to add an AI Chatbot. I noticed that in my embeddings strategy it only lets me choose one Main Content field. Would you recommend I use the block body, or the content body field? Is this something I can change to allow more than one field? What does contextual content mean if I use that for the rest?

Hi, Chris!  I'm glad you enjoyed my talk.  Embedding is important, and I glossed over it in my presentation.  Here are the answers to your questions:

Would you recommend I use the block body, or the content body field?

Short answer: content body field.

Longer answer: It depends on your real content!  If you're using the Umami demo content, like I did, then you might actually want to use the recipe instructions, rather than the body text, as your "main content."

What is main content? Remember my slide depicting the vector database?

[Image: a robot looking at a vector database]

Vector databases are not like the relational databases you might have used before; they store content as numeric vectors (embeddings) and compare those vectors to measure how conceptually related different words or short phrases are.  You need to "chop up" your content into chunks in order to feed it into the vector database, and luckily Drupal is good at that: all content in Drupal is already stored in a relational database, which means relationships between different things are already established.  When you use the Search API module, it automatically takes advantage of that structure.
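To make "chopping up" and "measuring relationships" a little more concrete, here is a toy sketch in Python.  It is not what Drupal or Pinecone actually run: `toy_embed` is a hypothetical stand-in that just counts words, whereas a real embedding model captures meaning rather than exact word overlap.  The function names and the chunk size are assumptions made up for illustration.

```python
import math
from collections import Counter


def chunk_words(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count."""
    return Counter(word.strip(".,!?").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks):
    """Return the chunk whose vector is most similar to the query's vector."""
    query_vec = toy_embed(query)
    return max(chunks, key=lambda chunk: cosine(query_vec, toy_embed(chunk)))


instructions = (
    "Preheat the oven to 200 degrees. "
    "Whisk the eggs with sugar until pale. "
    "Fold in the flour gently."
)
chunks = chunk_words(instructions, size=6)
best = search("how do I whisk eggs", chunks)
```

With this sample text, `chunks` holds three six-word pieces and `best` is the one about whisking the eggs.  A real vector database does the same kind of nearest-match lookup, only with far richer vectors and many more of them.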

Is this something I can change to allow more than one field?

Short answer: No.

Longer answer: No, and you don't want to.  When you designate a single field as the "main content," you're telling the embedding service (in my case, Pinecone) to focus its power on that field.  Whatever that field says is what the relationships get drawn from.

What does contextual content mean if I use that for the rest?

Short answer: Contextual content is good; you want to flag as many relevant fields as you can that way.

Longer answer: Still the same, but be careful; it really depends on your actual use case.  You don't want to index everything in a vector db, only the things that are logically relevant to your use case.  If you're using the Umami demo content, then your use case is recipes, and creating a chatbot that can talk about recipes.  In that case, you should ignore the "created by" field, for example, since your chatbot doesn't care who created a recipe.  (Or does it...?)
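One way to picture how the field choices fit together is the sketch below.  It assumes (and this is an assumption, not a description of the module's internals) that the single main field is what gets chunked, that contextual fields ride along with every chunk, and that ignored fields never reach the vector database.  The field names, the `FIELD_ROLES` mapping, and `build_chunks` are all hypothetical.

```python
# Hypothetical field-to-role mapping for an Umami-style recipe.
FIELD_ROLES = {
    "title": "contextual",
    "recipe_instructions": "main",
    "ingredients": "contextual",
    "created_by": "ignore",
}


def build_chunks(node, roles, chunk_size=12):
    """Chunk the one main field and prefix each chunk with the contextual fields."""
    main_fields = [name for name, role in roles.items() if role == "main"]
    if len(main_fields) != 1:
        raise ValueError("exactly one field must be the main content")

    # Contextual fields are joined once and repeated on every chunk.
    context = " | ".join(
        f"{name}: {node[name]}"
        for name, role in roles.items()
        if role == "contextual"
    )

    words = node[main_fields[0]].split()
    chunks = []
    for start in range(0, len(words), chunk_size):
        body = " ".join(words[start:start + chunk_size])
        chunks.append(context + "\n" + body)
    return chunks


recipe = {
    "title": "Deep mediterranean quiche",
    "recipe_instructions": "Preheat the oven. Roll out the pastry. Bake until golden.",
    "ingredients": "eggs, pastry, cheese",
    "created_by": "Umami editor",
}
recipe_chunks = build_chunks(recipe, FIELD_ROLES, chunk_size=4)
```

Every entry in `recipe_chunks` carries the title and ingredients, so a question about "quiche" can still match a chunk that only says "Bake until golden."  The "created by" value never shows up, which is the point of leaving it out.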
