From Strings to Semantics: Comparing Text with Java, Quarkus, and Embeddings
Learn how to build an AI-powered text similarity service using Quarkus, LangChain4j, and local embedding models.
As developers, we’re used to working with strings: parsing them, storing them, comparing them. But strings are just characters. "cat" and "kitten" may mean nearly the same thing to a human, but to a computer, they’re as different as "cat" and "spaceship".
What if your Java application could understand what text means? What if you could compare texts based on their semantic similarity, not just their literal characters?
That’s where text embeddings come in. Embeddings are high-dimensional vectors (arrays of floats) that represent the meaning of a word, sentence, or document. Texts with similar meanings are mapped to vectors that are close together in vector space. This opens up a whole new world: semantic search, recommendation systems, clustering, classification, and more.
In this tutorial, you’ll build a Quarkus application that:
Converts text to embeddings using LangChain4j and Ollama
Compares those embeddings using cosine similarity and Euclidean distance
Visualizes similarity scores in a simple web UI with Qute
You can find the full source code of the example in my GitHub repository.
What Exactly Are Embeddings?
Let’s say you enter the phrases "How’s the weather today?" and "Is it raining outside?" into your app. Even though these two sentences use different words, they mean something similar. Traditional string comparison techniques (like equals, regex, or Levenshtein distance) won’t capture that. But embeddings will.
Think of an embedding as a list of 768 numbers (or 384, 1024, etc., depending on the model). This vector lives in a multi-dimensional space where nearby vectors represent semantically similar texts.
"Quarkus is fast" → [0.21, -0.01, ..., 0.76] → Vector A
"Quarkus is performant" → [0.20, -0.03, ..., 0.74] → Vector B
CosineSimilarity(A, B) ≈ 0.92 → High similarity
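If you want to see the math behind that score, here is a minimal standalone sketch of cosine similarity over two float vectors. Later in this tutorial we’ll use LangChain4j’s built-in CosineSimilarity helper instead; this version exists purely for illustration:
// Cosine similarity = dot(a, b) / (|a| * |b|), assuming equal-length, non-zero vectors
static double cosineSimilarity(float[] a, float[] b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (int i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}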
All this happens thanks to models trained to understand the meaning and context of language. Fortunately, you don’t need to build the model yourself; you just need the right API and setup.
Bootstrapping the Project
Let’s create a Quarkus application with the necessary extensions:
mvn io.quarkus.platform:quarkus-maven-plugin:3.21.4:create \
-DprojectGroupId=org.acme \
-DprojectArtifactId=embedding-tutorial \
-Dextensions="rest-jackson,langchain4j-ollama,qute"
cd embedding-tutorial
You’ll use:
quarkus-langchain4j-ollama: for generating embeddings using a local model
quarkus-qute: to render the HTML output
rest-jackson: to handle JSON and HTTP endpoints
Now open src/main/resources/application.properties and configure the Ollama embedding model:
quarkus.langchain4j.ollama.base-url=http://localhost:11434
quarkus.langchain4j.ollama.timeout=60s
quarkus.langchain4j.ollama.embedding-model.model-name=nomic-embed-text
Make sure you have Ollama installed and running locally. Pull down the model with:
ollama pull nomic-embed-text
This model produces 768-dimensional vectors for any input text and is a great fit for general-purpose embedding.
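If you want to sanity-check the model before writing any Java, you can call Ollama’s embeddings API directly (the exact JSON response shape may vary slightly between Ollama versions):
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "Quarkus is fast"}'
You should get back a JSON object containing an embedding array of 768 floats.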
Note: Parameters like temperature or top_k, common in text generation, don’t apply here. Embeddings are deterministic: the same input always produces the same vector.
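Because of that, you can verify determinism yourself once the application is wired up. A minimal sketch, assuming an injected EmbeddingModel (as in the resource class shown later):
// Same input, same model: the vectors should match element for element
float[] first = embeddingModel.embed("Quarkus is fast").content().vector();
float[] second = embeddingModel.embed("Quarkus is fast").content().vector();
assert java.util.Arrays.equals(first, second);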
If you cannot install Ollama, or simply don’t want to, you can also use Quarkus Dev Services, which start a local Ollama container for you. Learn more about it in the Quarkus documentation.
A Quick Look Under the Hood
Here’s what embedding generation really looks like:
Tokenization: Text is split into tokens, like ["Quarkus", "is", "fun", "!"]
Token IDs: Each token maps to an integer
Initial Vectors: The model has a table mapping token IDs to float[] vectors
Contextualization: These vectors are adjusted based on surrounding tokens
Pooling: All token vectors are combined (e.g., averaged) into one final embedding vector
This final vector is what we use to compare texts.
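To make the pooling step concrete, here is an illustrative mean-pooling sketch over hypothetical per-token vectors. The model performs this internally, so you never write this code yourself:
// Hypothetical mean pooling: average N contextualized token vectors into one embedding
static float[] meanPool(float[][] tokenVectors) {
    int dims = tokenVectors[0].length;
    float[] pooled = new float[dims];
    for (float[] token : tokenVectors) {
        for (int i = 0; i < dims; i++) {
            pooled[i] += token[i];
        }
    }
    for (int i = 0; i < dims; i++) {
        pooled[i] /= tokenVectors.length;
    }
    return pooled;
}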
Why Embeddings Matter
Embeddings play a crucial role in many real-world AI applications because they allow you to represent the meaning of data in a compact, machine-readable format. For instance, in semantic search, embeddings let you rank results not by keyword matches but by how closely their meanings align with the user’s query. They’re also foundational in Retrieval-Augmented Generation (RAG), where they help large language models tap into private or domain-specific data to generate more accurate answers. Embeddings make it possible to automatically cluster similar documents or support tickets, which is useful in customer service and knowledge management. In recommendation systems, they help surface related products, articles, or media based on user interests or item similarity. And because embeddings encode rich features, they’re often used in classification and anomaly detection tasks—helping models distinguish between normal and unusual patterns in everything from emails to logs to transaction histories.
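To give you a taste of the semantic search use case, here is a small sketch that ranks made-up candidate texts by cosine similarity to a query, using the same LangChain4j types the resource class below relies on:
// Rank candidates by semantic closeness to the query (most similar first)
Embedding query = embeddingModel.embed("How do I speed up my Java app?").content();
List<String> candidates = List.of(
        "Tuning JVM startup time",
        "Baking sourdough bread",
        "Optimizing Quarkus applications");
candidates.stream()
        .sorted(Comparator.comparingDouble(
                (String c) -> CosineSimilarity.between(query, embeddingModel.embed(c).content()))
                .reversed())
        .forEach(System.out::println);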
Writing the Similarity Endpoint
Now let’s implement a REST endpoint to calculate embedding similarity.
Create SimilarityResource.java in src/main/java/org/acme/:
@Path("/")
public class SimilarityResource {
private static final Logger log = Logger.getLogger(SimilarityResource.class); // Optional logging
@Inject
EmbeddingModel embeddingModel;
@Inject
Template similarity;
@GET
@Path("/similarity")
@Produces(MediaType.TEXT_HTML)
public String get(
@QueryParam("text1") String text1,
@QueryParam("text2") String text2) {
if (isBlank(text1) || isBlank(text2)) {
log.warn("Missing input parameter for similarity check.");
return similarity.data("error", "Please provide both text1 and text2 parameters.").render();
}
try {
log.infof("Calculating similarity for text1: '%s' and text2: '%s'", text1, text2);
// --- Embedding Generation ---
Response<Embedding> embedding1 = embeddingModel.embed(text1);
Response<Embedding> embedding2 = embeddingModel.embed(text2);
float[] vector1 = embedding1.content().vector();
float[] vector2 = embedding2.content().vector();
// --- Calculations ---
double cosSimilarity = CosineSimilarity.between(embedding1.content(), embedding2.content());
float[] normVec1 = normalize(vector1);
float[] normVec2 = normalize(vector2);
double eucDistance = euclideanDistance(normVec1, normVec2);
int dimensions = (vector1 != null) ? vector1.length : 0;
// --- Prepare data for the template ---
double clampedCosSimilarity = Math.max(0.0, Math.min(1.0, cosSimilarity));
int similarityPercentage = (int) (clampedCosSimilarity * 100);
// Format numbers to 4 decimal places
String formattedCosSimilarity = String.format("%.4f", cosSimilarity);
String formattedEucDistance = String.format("%.4f", eucDistance);
// Format partial vectors for display (show first 10 dimensions)
String vec1StartStr = formatVector(vector1, 10);
String vec2StartStr = formatVector(vector2, 10);
// Use a Map to pass data to the template
Map<String, Object> data = new HashMap<>();
data.put("text1", text1);
data.put("text2", text2);
data.put("dimensions", dimensions);
data.put("cosSimilarity", formattedCosSimilarity);
data.put("similarityPercentage", similarityPercentage);
data.put("eucDistance", formattedEucDistance);
data.put("vector1Start", vec1StartStr);
data.put("vector2Start", vec2StartStr);
data.put("error", null);
return similarity.data(data).render();
} catch (Exception e) {
log.error("Error calculating similarity", e);
return similarity.data("error", "An error occurred processing the request: " + e.getMessage()).render();
}
}
// normalize, formatVector, euclideanDistance methods omitted for brevity...
}
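The omitted helper methods are straightforward. In case you want to fill them in yourself, here is one possible implementation, offered as a sketch (your formatting choices may differ):
private static boolean isBlank(String s) {
    return s == null || s.trim().isEmpty();
}

// Scale a vector to unit length so the Euclidean distance falls into [0, 2]
private static float[] normalize(float[] vector) {
    double norm = 0.0;
    for (float v : vector) {
        norm += v * v;
    }
    norm = Math.sqrt(norm);
    float[] result = new float[vector.length];
    for (int i = 0; i < vector.length; i++) {
        result[i] = (float) (vector[i] / norm);
    }
    return result;
}

private static double euclideanDistance(float[] a, float[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
        double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.sqrt(sum);
}

// Render the first maxDims entries, e.g. "[0.2100, -0.0100, ...]"
private static String formatVector(float[] vector, int maxDims) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < Math.min(maxDims, vector.length); i++) {
        if (i > 0) sb.append(", ");
        sb.append(String.format("%.4f", vector[i]));
    }
    return sb.append(", ...]").toString();
}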
This endpoint:
Embeds both input texts
Calculates cosine similarity (angle between vectors)
Normalizes both vectors and computes Euclidean distance
Passes results to the Qute template for rendering
Designing the Qute Template
Create the file src/main/resources/templates/similarity.html:
<!DOCTYPE html>
<html>
<head>
<title>Embedding Similarity</title>
<!-- Styles skipped for brevity -->
</head>
<body>
<h2>Embedding Similarity Analysis</h2>
{#if error}
<p class="error">{error}</p>
{#else}
<h3>Input Text 1:</h3>
<pre>{text1}</pre>
<h3>Input Text 2:</h3>
<pre>{text2}</pre>
<hr>
<h3>Results:</h3>
<p><strong>Vector Dimensions:</strong> {dimensions}</p>
<div>
<p style="display: inline-block; margin-bottom: 15px;">
<strong>Cosine Similarity:</strong> {cosSimilarity} (Range: [-1, 1], closer to 1 is more
similar)
</p>
<div class="similarity-bar-container" title="Cosine Similarity: {cosSimilarity}">
<div class="similarity-bar" style="width: {similarityPercentage}%;">
{similarityPercentage}%
</div>
</div>
</div>
<p><strong>Euclidean Distance (Normalized):</strong> {eucDistance} (Range: [0, 2], closer to 0 is more
similar)</p>
{#if vector1Start}
<hr>
<h4>Embedding Vectors (First 10 Dimensions):</h4>
<p>This is the actual start of the numerical vector (embedding) generated for each text. The full vector has
{dimensions} dimensions.</p>
<p>Vector for Text 1:</p>
<pre>{vector1Start}</pre>
<p>Vector for Text 2:</p>
<pre>{vector2Start}</pre>
{/if}
{/if}
</body>
</html>
Run and See It in Action
Start the Quarkus dev server:
./mvnw quarkus:dev
Then open your browser and test the endpoint:
http://localhost:8080/similarity?text1=Quarkus+is+fast&text2=I+like+fast+Java
http://localhost:8080/similarity?text1=An+apple+is+a+fruit&text2=A+truck+is+a+vehicle
You’ll see the original texts, similarity scores, vector dimensions, and a nice bar indicating how close the meanings are.
Wait—Shouldn’t 0.5764 Mean More Than 57%?
If you tried this comparison:
http://localhost:8080/similarity?text1=Quarkus+is+fast&text2=I+like+fast+Java
You probably noticed the similarity score reported as 0.5764, which visually becomes a 57% similarity bar. But hang on: doesn’t a cosine similarity of 0.57 imply more than just 57% similarity?
The Cosine Similarity Range
Cosine similarity is theoretically defined over the range [-1, 1]:
1.0 → Vectors point in the same direction → Perfectly similar
0.0 → Vectors are orthogonal → Unrelated
-1.0 → Vectors point in opposite directions → Perfectly dissimilar
So yes, technically, 0.5764 lies well above halfway between opposite and identical. That could suggest something like 75–78% similarity, depending on how you map the range.
What the Code Actually Does
Let’s revisit this line in the code:
double clampedCosSimilarity = Math.max(0.0, Math.min(1.0, cosSimilarity));
int similarityPercentage = (int)(clampedCosSimilarity * 100);
This calculation clamps the similarity score into the [0.0, 1.0] range and scales it directly to [0%, 100%]. Any score below zero is treated as 0%. Any score above 1.0 (which shouldn’t happen) is capped at 100%.
This means:
cosSimilarity = 0.5764 → clampedCosSimilarity = 0.5764 → similarityPercentage = 57%
Why This Mapping?
This approach is deliberate and fairly common. Here’s why:
Most real-world embeddings rarely produce negative cosine similarity unless the texts are deliberately crafted to be contradictory.
Progress bars are more intuitive when 0% means “not similar at all.” Negative values, while mathematically meaningful, often just clutter the UX.
It aligns with user expectations: high score = more green bar.
What If You Want a Full [-1, 1] Mapping?
That’s valid too! If you want to show the full cosine similarity range linearly as [0%, 100%], you could use:
double scaledScore = (cosSimilarity + 1.0) / 2.0;
int similarityPercentageLinear = (int)(Math.max(0.0, Math.min(1.0, scaledScore)) * 100);
With this, your 0.5764 score becomes:
(0.5764 + 1.0) / 2.0 = 0.7882 → 78% similarity
Which arguably feels more “correct” for mathematical purists.
Which Is Better?
There’s no one right answer—it depends on your goals:
Clamped 0–1 range: Great for intuitive UI. Easy to interpret. Ideal for general users.
Linear -1 to 1 mapping: More mathematically faithful. Shows “neutral” (orthogonal) as 50%.
In this tutorial, we’re sticking with the clamped [0, 1] mapping for simplicity and visual clarity. But now you know the tradeoffs and can adjust the code to suit your needs.
Conclusion and Next Steps
Congratulations! You just built a working semantic similarity service in Java using Quarkus, LangChain4j, and a local Ollama model. Along the way, you learned:
What embeddings are and how they’re computed
How to compare them using cosine similarity and Euclidean distance
How to visualize the result using Qute
What’s next?
Try other models like all-minilm, sentence-transformers, or bge-base (see the configuration sketch after this list)
Integrate a vector store like Chroma or Weaviate to persist and query many embeddings
Extend this into a RAG app by combining embeddings with a chat model
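Swapping the embedding model is just a pull plus a one-line configuration change. For example, assuming the model is available in the Ollama library under this name:
ollama pull all-minilm

quarkus.langchain4j.ollama.embedding-model.model-name=all-minilm
Keep in mind that different models produce vectors with different dimensions, so embeddings created by different models can’t be meaningfully compared with each other.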
The beauty of this approach? It’s just Java. No need to jump languages or tooling to explore cutting-edge AI.
Learn Even More
Curious to go further with Quarkus and LangChain4j? Here are some resources to help you dive deeper:
Quarkus + LangChain4j Workshop – A hands-on workshop that walks you through building AI-infused Java applications using Quarkus, LangChain4j, and local LLMs.
Quarkus LangChain4j Extension Documentation – Official docs for integrating LangChain4j with Ollama in Quarkus projects.
LangChain4j GitHub Repository – Explore the source code, browse examples, and learn about advanced features like chat memory, RAG, and toolchains.
These links will help you expand on what you’ve built today, whether you’re looking to add chat interfaces, vector databases, or custom pipelines.