Dynamic CSV Uploads in Java: Building Smart, Flexible Databases with Quarkus
Master the art of ingesting any CSV file into PostgreSQL dynamically using Quarkus, Hibernate Panache, and Jackson CSV — no rigid schemas required.
You know how it goes. You're focused on your work when suddenly a manager drops a pile of random CSV files onto your desk with the classic instruction: "Just get these into the database somehow." There’s no fixed structure, no guarantees about the columns, and of course, the deadline was yesterday. You need a solution that's flexible, fast, and production-ready.
Fortunately, with the right tools (Quarkus, Hibernate Panache, PostgreSQL, and Jackson CSV) you can build exactly what’s needed without losing your sanity.
In this article, you’ll build a dynamic CSV upload application that:
Accepts any CSV file.
Automatically creates the database schema.
Inserts the data safely.
Tracks metadata about each upload.
Lists the dynamic tables and their content.
Minimal assumptions. Maximal flexibility. You’ll probably need roughly 30 minutes to build this yourself, but you can also just check out the repository on my GitHub for a quickstart.
Let’s get to work.
Project Setup
First, create a new project using the Quarkus CLI:
quarkus create app org.acme:csv-uploader --extension="quarkus-rest,quarkus-hibernate-orm-panache,quarkus-jdbc-postgresql,quarkus-smallrye-openapi,quarkus-jackson"
cd csv-uploader
You'll also need Jackson's CSV module. Add this to your pom.xml manually:
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-csv</artifactId>
</dependency>
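No explicit version should be needed here: the Quarkus BOM already manages the Jackson modules, so jackson-dataformat-csv stays aligned with the rest of your Jackson dependencies.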
Why these choices?
quarkus-rest: A lightweight, fast HTTP layer.
Hibernate Panache: Simplifies entity management.
PostgreSQL: A robust relational database.
SmallRye OpenAPI: Auto-generates API docs.
Jackson: For parsing JSON and CSV formats easily.
Database Configuration
In your src/main/resources/application.properties, configure PostgreSQL:
quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=postgres
quarkus.datasource.password=postgres
quarkus.hibernate-orm.database.generation=update
quarkus.hibernate-orm.log.sql=true
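In dev mode, Quarkus Dev Services can start a throwaway PostgreSQL container automatically, so no JDBC URL is strictly required. If you point at your own database instead, add the URL explicitly; host and database name below are placeholders:
quarkus.datasource.jdbc.url=jdbc:postgresql://localhost:5432/postgres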
This lets Hibernate create and update our permanent metadata table automatically. Dynamic tables will be created manually based on each CSV.
Metadata Entity: Tracking Uploads
We'll track every upload with a small Panache entity:
package org.acme.entity;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Entity;
import java.time.LocalDateTime;
@Entity
public class CsvUploadMetadata extends PanacheEntity {
public String tableName;
public LocalDateTime uploadTime;
public int recordCount;
}
Why?
You want to know what got uploaded, when, and how many records.
Auditing matters, even for quick tools.
Parsing CSV Files Dynamically
CSV structures are unpredictable, so instead of mapping to a fixed Java class, we’ll parse them into simple key-value maps.
package org.acme.util;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import jakarta.enterprise.context.ApplicationScoped;
import org.jboss.resteasy.reactive.multipart.FileUpload;
import java.io.IOException;
import java.util.List;
import java.util.Map;
@ApplicationScoped
public class CsvReader {
public List<Map<String, String>> parseCsv(FileUpload fileUpload) throws IOException {
CsvMapper mapper = new CsvMapper();
CsvSchema schema = CsvSchema.emptySchema().withHeader();
MappingIterator<Map<String, String>> it = mapper.readerFor(Map.class)
.with(schema)
.readValues(fileUpload.uploadedFile().toFile());
return it.readAll();
}
}
Each row is now a Map<String, String>, perfect for dynamic processing.
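For the small test.csv used later in this article, the first parsed row would print roughly like this (illustrative toString() output of the map):
{first_name=John, last_name=Doe, email=john.doe@example.com}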
Schema Validation and Type Inference
Before creating a database table, we’ll validate the header and infer column types based on sample data.
Here’s a basic type inference utility:
package org.acme.util;
import java.util.regex.Pattern;
public class TypeInference {
private static final Pattern NUMBER_PATTERN = Pattern.compile("-?\\d+(\\.\\d+)?");
public static String inferType(String value) {
if (value == null || value.isBlank()) {
return "TEXT"; // Default fallback
}
if (NUMBER_PATTERN.matcher(value).matches()) {
return value.contains(".") ? "DOUBLE PRECISION" : "INTEGER";
}
return "TEXT";
}
}
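A few illustrative calls show how the inference behaves (the values are made up for demonstration):
TypeInference.inferType("42");    // INTEGER
TypeInference.inferType("3.14");  // DOUBLE PRECISION
TypeInference.inferType("hello"); // TEXT
TypeInference.inferType("");      // TEXT (fallback for blank values)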
Bottom line:
Bad headers are rejected (the actual check happens in the upload service below).
Types are assigned (INTEGER, DOUBLE PRECISION, or the fallback TEXT).
Creating Dynamic Tables and Inserting Data
Now the heart of the system: dynamic table creation and insertion.
Here’s the CsvUploadService:
package org.acme.service;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.persistence.EntityManager;
import jakarta.transaction.Transactional;
import org.acme.entity.CsvUploadMetadata;
import org.acme.util.CsvReader;
import org.acme.util.TypeInference;
import org.jboss.resteasy.reactive.multipart.FileUpload;
import java.io.IOException;
import java.time.LocalDateTime;
import java.util.*;
import java.util.stream.Collectors;
@ApplicationScoped
public class CsvUploadService {
@Inject
EntityManager entityManager;
@Inject
CsvReader csvReader;
@Transactional
public CsvUploadMetadata uploadCsv(FileUpload fileUpload) throws IOException {
List<Map<String, String>> rows = csvReader.parseCsv(fileUpload);
if (rows.isEmpty()) {
throw new RuntimeException("Empty CSV file");
}
String tableName = "csv_" + UUID.randomUUID().toString().replace("-", "");
Map<String, String> sampleRow = rows.get(0);
Map<String, String> columnTypes = new HashMap<>();
sampleRow.forEach((key, value) -> {
if (key == null || key.trim().isEmpty()) {
throw new RuntimeException("Invalid column name detected in CSV header.");
}
columnTypes.put(key, TypeInference.inferType(value));
});
String createTableSql = "CREATE TABLE " + tableName + " (id SERIAL PRIMARY KEY, " +
columnTypes.entrySet().stream()
.map(entry -> "\"" + entry.getKey() + "\" " + entry.getValue())
.collect(Collectors.joining(", "))
+ ")";
entityManager.createNativeQuery(createTableSql).executeUpdate();
for (Map<String, String> row : rows) {
String columns = row.keySet().stream()
.map(col -> "\"" + col + "\"")
.collect(Collectors.joining(", "));
String values = row.values().stream()
.map(value -> value == null ? "NULL" : "'" + value.replace("'", "''") + "'")
.collect(Collectors.joining(", "));
String insertSql = "INSERT INTO " + tableName + " (" + columns + ") VALUES (" + values + ")";
entityManager.createNativeQuery(insertSql).executeUpdate();
}
CsvUploadMetadata metadata = new CsvUploadMetadata();
metadata.tableName = tableName;
metadata.uploadTime = LocalDateTime.now();
metadata.recordCount = rows.size();
metadata.persist();
return metadata;
}
}
Notes:
Table names are random UUIDs to avoid conflicts.
SQL injection risk is reduced rather than eliminated: column names are quoted and single quotes in values are escaped, but parameterized inserts are the safer choice for anything beyond an internal tool (see the sketch below).
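As a minimal sketch of that safer variant, the insert loop could bind values as positional parameters instead of concatenating escaped literals. It reuses tableName, rows, columnTypes, and entityManager from CsvUploadService above and additionally needs a java.util.stream.IntStream import; values are converted to matching Java types because PostgreSQL will not coerce a varchar parameter into a numeric column:
for (Map<String, String> row : rows) {
    List<String> cols = new ArrayList<>(row.keySet());
    String columnList = cols.stream()
            .map(col -> "\"" + col + "\"")
            .collect(Collectors.joining(", "));
    // ?1, ?2, ... placeholders instead of inlined, escaped string literals
    String placeholders = IntStream.rangeClosed(1, cols.size())
            .mapToObj(i -> "?" + i)
            .collect(Collectors.joining(", "));
    var insert = entityManager.createNativeQuery(
            "INSERT INTO " + tableName + " (" + columnList + ") VALUES (" + placeholders + ")");
    for (int i = 0; i < cols.size(); i++) {
        String col = cols.get(i);
        String raw = row.get(col);
        // Bind with a Java type that matches the inferred column type
        Object value = switch (columnTypes.get(col)) {
            case "INTEGER" -> Integer.valueOf(raw);
            case "DOUBLE PRECISION" -> Double.valueOf(raw);
            default -> raw; // TEXT columns keep the raw string
        };
        insert.setParameter(i + 1, value);
    }
    insert.executeUpdate();
}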
REST API: Upload Endpoint
Simple and clean, here’s your resource:
package org.acme.resource;
import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import org.acme.entity.CsvUploadMetadata;
import org.acme.service.CsvUploadService;
import org.jboss.resteasy.reactive.MultipartForm;
import org.jboss.resteasy.reactive.multipart.FileUpload;
import org.eclipse.microprofile.openapi.annotations.Operation;
import org.eclipse.microprofile.openapi.annotations.responses.APIResponse;
@Path("/csv")
public class CsvUploadResource {
@Inject
CsvUploadService csvUploadService;
@POST
@Consumes(MediaType.MULTIPART_FORM_DATA)
@Produces(MediaType.APPLICATION_JSON)
@Operation(summary = "Upload a CSV and create a dynamic table", description = "Uploads a CSV file, stores it in a dynamically created table, and returns metadata.")
@APIResponse(responseCode = "200", description = "Upload successful")
public CsvUploadMetadata uploadCsv(@MultipartForm FileUpload fileUpload) throws Exception {
return csvUploadService.uploadCsv(fileUpload);
}
}
Thanks to SmallRye OpenAPI, this endpoint will be automatically visible at /q/swagger-ui/, and you can use it to upload a test.csv like the following:
first_name,last_name,email
John,Doe,john.doe@example.com
Jane,Smith,jane.smith@example.com
I have added it to the src/main/resources folder of the repository for convenience.
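From a terminal, the same upload works with curl. The part name is assumed to match the fileUpload parameter of the resource method; adjust it if your multipart binding expects a different name:
curl -F "fileUpload=@src/main/resources/test.csv;type=text/csv" http://localhost:8080/csv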
The CSV Table Viewer Endpoint
Once you can upload random CSVs into dynamic tables, the next obvious step is being able to see what’s inside.
That’s exactly what the CsvShowResource class is built for: a clean, secure, and JSON-friendly way to view the uploaded data.
Let’s walk through how it works.
Resource Structure: /csv-show
The CsvShowResource class exposes all viewer functionality under a dedicated path:
@Path("/csv-show")
@Produces(MediaType.APPLICATION_JSON)
public class CsvShowResource {
@Inject
EntityManager entityManager;
}
Key points:
Every endpoint sits under /csv-show.
Every response is clean JSON.
It uses JPA’s EntityManager directly for dynamic SQL, because we’re dealing with tables not tied to static entities.
No unnecessary abstractions here. We keep it simple and explicit.
Listing Uploaded Tables
You probably want to see which tables exist first. Here’s the endpoint:
@GET
@Path("/tables")
public List<Map<String, Object>> listUploadedTables() {
return CsvUploadMetadata.<CsvUploadMetadata>listAll().stream()
.map(metadata -> Map.<String, Object>of(
"uid", metadata.tableName,
"uploadTime", metadata.uploadTime,
"recordCount", metadata.recordCount
))
.toList();
}
What it does:
Queries all metadata entries using Panache’s listAll().
Maps each metadata record into a small JSON object.
Keeps the output clean: table name, upload time, record count.
Example output:
[
{
"uid": "csv_78fb63ed80ea4931b297bd9ed7ce49d7",
"uploadTime": "2024-03-21T10:30:00",
"recordCount": 42
}
]
No extra noise: just the essentials you need to know what’s available.
Viewing Table Content
When you want to look inside a dynamic table, this endpoint handles it:
@GET
@Path("/table/{tableName}")
@Transactional
@SuppressWarnings("unchecked")
public List<Map<String, Object>> showTableContent(@PathParam("tableName") String tableName) {
if (!isTableAllowed(tableName)) {
throw new IllegalArgumentException("Table not found");
}
String sql = "SELECT * FROM " + tableName;
List<Object[]> rows = (List<Object[]>) entityManager
.createNativeQuery(sql, Object[].class)
.getResultList();
List<String> columnNames = entityManager.createNativeQuery(
"SELECT column_name FROM information_schema.columns WHERE table_name = '" + tableName + "' ORDER BY ordinal_position"
).getResultList();
return mapResults(rows, columnNames);
}
What’s happening here:
Security first: it checks whether the requested table is registered in CsvUploadMetadata.
Data retrieval: native SQL fetches all rows from the table.
Schema discovery: column names are queried dynamically from PostgreSQL’s information_schema.
Result mapping: rows are paired with their corresponding column names into JSON objects.
The @Transactional annotation ensures everything runs inside a managed persistence context.
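Assuming the test.csv from earlier was uploaded, a call to /csv-show/table/{tableName} for the resulting table would return roughly this (the id column comes from the generated primary key):
[
  {
    "id": 1,
    "first_name": "John",
    "last_name": "Doe",
    "email": "john.doe@example.com"
  },
  {
    "id": 2,
    "first_name": "Jane",
    "last_name": "Smith",
    "email": "jane.smith@example.com"
  }
]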
Security: Only Allow Valid Tables
Security is simple but critical: No random table names allowed.
private boolean isTableAllowed(String tableName) {
return CsvUploadMetadata.find("tableName", tableName).firstResult() != null;
}
This check:
Verifies the table name exists in our metadata.
Prevents SQL injection or accidental access to unrelated database tables.
You don’t want users poking around your production schema, right?
Mapping the Results
Finally, once the data and columns are available, we stitch them together:
private List<Map<String, Object>> mapResults(List<Object[]> rows, List<String> columnNames) {
return rows.stream().map(row -> {
Map<String, Object> map = new java.util.HashMap<>();
for (int i = 0; i < row.length; i++) {
map.put(columnNames.get(i), row[i]);
}
return map;
}).toList();
}
How it works:
Every database row (Object[]) is zipped with the column names.
Builds a proper JSON object per row.
Streams everything efficiently.
This keeps your response nicely structured, no matter how wild the original CSV was.
Other Approaches: Beyond Dynamic Tables
Creating dynamic tables isn’t the only way to handle unpredictable data.
One alternative is using a document store like MongoDB. Document stores naturally accept varying schemas, making them a great fit for unknown CSV structures. You lose some SQL querying power, but gain a lot of flexibility.
Another option is using JSON blob fields inside PostgreSQL. Instead of creating tables, you could store each CSV row as a JSON document inside a JSONB field. Querying individual fields becomes trickier unless you carefully index them, but it’s simpler for ingestion.
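As a rough sketch of that alternative (the csv_documents table with its source and data columns is an assumption made up for illustration, not part of the project above), ingestion inside a method that already declares throws IOException could look like this:
// One fixed table instead of a table per upload:
//   CREATE TABLE csv_documents (id SERIAL PRIMARY KEY, source TEXT, data JSONB)
// Requires com.fasterxml.jackson.databind.ObjectMapper
ObjectMapper json = new ObjectMapper();
for (Map<String, String> row : rows) {
    entityManager.createNativeQuery(
            "INSERT INTO csv_documents (source, data) VALUES (?1, CAST(?2 AS jsonb))")
        .setParameter(1, fileUpload.fileName())
        .setParameter(2, json.writeValueAsString(row))
        .executeUpdate();
}
// Individual fields can then be reached with PostgreSQL's JSON operators, e.g.
//   SELECT data->>'email' FROM csv_documents WHERE source = 'test.csv'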
Finally, dedicated data-flow tools like Apache NiFi, or change-data-capture pipelines built on Debezium, can automate and monitor complex data flows. But that's heavy machinery for a simple upload tool.
The dynamic-table model we chose keeps everything SQL-friendly, lightweight, and manageable, which fits most internal app needs.
Real-World Considerations
A few important points before you unleash this:
Schema Evolution: Decide what happens when new CSVs have extra/missing fields.
Big Files: For large CSVs, switch from row-by-row inserts to batch operations or PostgreSQL’s COPY feature (a batching sketch follows this list).
Security: Use a restricted DB user to limit permissions.
Operational Management: Clean up old tables regularly if you expect hundreds of uploads.
Query Performance: Consider adding indexes if the tables are actively queried.
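For the batching point above, here is a minimal sketch under the same assumptions as CsvUploadService (tableName, rows); the hypothetical toJavaValue helper stands in for the type conversion shown in the parameterized sketch earlier, and the code needs java.sql.PreparedStatement and org.hibernate.Session imports:
List<String> cols = new ArrayList<>(rows.get(0).keySet());
String sql = "INSERT INTO " + tableName + " ("
        + cols.stream().map(c -> "\"" + c + "\"").collect(Collectors.joining(", "))
        + ") VALUES ("
        + cols.stream().map(c -> "?").collect(Collectors.joining(", ")) + ")";
// Session.doWork exposes the underlying JDBC connection for batching
entityManager.unwrap(Session.class).doWork(connection -> {
    try (PreparedStatement ps = connection.prepareStatement(sql)) {
        int pending = 0;
        for (Map<String, String> row : rows) {
            for (int i = 0; i < cols.size(); i++) {
                // toJavaValue: hypothetical helper converting the raw string to
                // Integer, Double, or String based on the inferred column type
                ps.setObject(i + 1, toJavaValue(cols.get(i), row.get(cols.get(i))));
            }
            ps.addBatch();
            if (++pending % 500 == 0) {
                ps.executeBatch(); // send every 500 rows in one round trip
            }
        }
        ps.executeBatch(); // flush the remainder
    }
});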
Conclusion
You just built a flexible, powerful dynamic CSV ingestion system with:
Smart type inference.
Safe schema validation.
Automated database handling.
Quarkus performance under the hood.
With only a few classes and clear patterns, you now have the foundation for admin tools, data sandboxes, or internal ingestion APIs.
And thanks to Quarkus, it’s fast, efficient, and fun to build. Make sure to try it out yourself and learn more about Quarkus.