In the Docker era, it's easy to scale this worker across my entire cluster.
If you already have access to that infrastructure, then use it. Bundle your Kafka client in some kind of minimal container with health checks and so on, and for the most part this works great. A Kafka client dependency plus a database dependency is really all you need, right?
If you are not using Spark, Flink, etc., though, you will need to handle Kafka errors, retries, offsets, and reprocessing yourself in code, rather than letting the framework deal with them for you.
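To make that concrete, here is a minimal sketch of the loop you end up owning with the plain Java client. The broker address, topic name, group id, and the `writeToDatabase` helper are placeholders for whatever your worker actually does:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class WorkerConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "worker-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Commit offsets ourselves, only after the database write succeeds
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    try {
                        writeToDatabase(record.value()); // hypothetical DB call
                    } catch (Exception e) {
                        // retry / dead-letter handling is on you here, not on a framework
                        System.err.println("Failed at offset " + record.offset() + ": " + e);
                    }
                }
                consumer.commitSync(); // commit only after the batch has been handled
            }
        }
    }

    private static void writeToDatabase(String value) {
        // placeholder for your JDBC / Mongo / etc. write
    }
}
```

Run more copies of that container and the consumer group rebalances partitions across them; that is the scaling story whether or not Docker is involved.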
I will add that if you want to move data between Kafka and a database, check out the Kafka Connect API. Connectors already exist for JDBC, Mongo, Couchbase, Cassandra, etc.
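For example, sinking a topic into Postgres with the Confluent JDBC sink connector is mostly a matter of configuration posted to the Connect REST API. The connector name, topic, table handling, URL, and credentials below are made up for the sketch, and the exact properties depend on the connector you pick:

```json
{
  "name": "events-jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "2",
    "topics": "events",
    "connection.url": "jdbc:postgresql://postgres:5432/mydb",
    "connection.user": "kafka",
    "connection.password": "secret",
    "insert.mode": "upsert",
    "pk.mode": "record_key",
    "auto.create": "true"
  }
}
```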
If you need more complete stream-processing power, I would go for Kafka Streams rather than maintaining a separate Spark cluster, so it stays "just Kafka".
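As a rough sketch of what that looks like, assuming placeholder topic names `events` and `events-processed`, the whole "cluster" is just a normal Java application you can run in the same containers as everything else:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class EnrichApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrich-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("events");
        events
            .filter((key, value) -> value != null && !value.isEmpty())
            .mapValues(value -> value.toUpperCase()) // stand-in for real processing
            .to("events-processed");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Scaling it out is the same story as any consumer group: start more instances with the same application id and the partitions rebalance across them; offsets, retries, and local state are handled by the library rather than by a separate scheduler.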
Create a Spark cluster
Suppose you do not want to maintain that, or rather, you cannot pick between YARN, Mesos, Kubernetes, or Standalone. And if you are already running one of the first three, it might be worth looking at how Docker fits on top of them anyway.
You are absolutely right that this is additional overhead, so I find it all depends on what you already have (for example, an existing Hadoop / YARN cluster with spare resources) and what you are willing to maintain in-house (or pay a provider for, e.g. Kafka and Databricks as hosted offerings).
In addition, Spark does not ship the latest Kafka client library (it was Spark 2.4.0 that upgraded to the Kafka 2.0 client, I believe), so you will need to decide whether that is a selling point for you.
As for truly streaming libraries, rather than Spark's micro-batching, Apache Beam or Flink would probably let you run the same kinds of workloads against Kafka.
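For instance, a Flink job reading the same topic is only a few lines with the `flink-connector-kafka` dependency; this is just a sketch with made-up broker and topic names, using the `FlinkKafkaConsumer` source:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

import java.util.Properties;

public class FlinkKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("group.id", "flink-worker");

        // Record-at-a-time stream from Kafka; scaling means raising the job's parallelism
        env.addSource(new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props))
           .map(String::toUpperCase) // stand-in for real processing
           .print();

        env.execute("kafka-flink-sketch");
    }
}
```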
In general, to scale producers / consumers you need some kind of resource scheduler. Installing Spark may not be difficult, but knowing how to use it efficiently and tune it for the right resources can be.