Question: SBT: How to package an instance of a class as a JAR?

Answers 2
Added at 2017-11-08 17:11
Question

I have code which essentially looks like this:

class FoodTrainer(images: S3Path) {     // images: >100 GB of data living in S3
  def train(): FoodClassifier = ???     // Very expensive: takes ~5 hours!
}

class FoodClassifier {                  // Lightweight API class
  def isHotDog(input: Image): Boolean = ???
}

At JAR-assembly time (sbt assembly), I want to invoke val classifier = new FoodTrainer(s3Dir).train() and publish a JAR in which the classifier instance is instantly available to downstream library users.

What is the easiest way to do this? What are some established paradigms for this? I know it's a fairly common idiom in ML projects to publish trained models, e.g. http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar

How do I do this using sbt assembly without having to check a large model class or data file into version control?

Answers

No. #1, added: 2017-11-08 17:11

Here's an idea: put your model in a resources folder that gets added to the JAR assembly. I think every JAR you build will then ship with the model, as long as it's in that folder. Let me know how it goes, cheers!
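To keep the model file out of version control, one option is an sbt resource generator that copies an already-trained model into the managed resources directory at build time. This is only a sketch, assuming sbt 1.1+ (for the slash syntax); the `MODEL_FILE` environment variable, the fallback path, and the file name `food-classifier.bin` are made up for illustration, and the training step itself is assumed to have produced that file beforehand.

```scala
// build.sbt (sketch; MODEL_FILE and food-classifier.bin are hypothetical names)
Compile / resourceGenerators += Def.task {
  // Location of the trained model, kept outside version control.
  val src = file(sys.env.getOrElse("MODEL_FILE", "target/food-classifier.bin"))
  // Managed resources are packaged like anything under src/main/resources.
  val dst = (Compile / resourceManaged).value / "food-classifier.bin"
  IO.copyFile(src, dst)
  Seq(dst) // a generator must return the files it produced
}.taskValue
```

With this in place the model should end up on the classpath of the packaged JAR, so it is worth verifying with `jar tf` on the assembly output before publishing.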

See this guide for reading a file from the resources folder:

https://www.mkyong.com/java/java-read-a-file-from-resources-folder/

The example is in Java, but you can use the same API from Scala.
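For reference, a minimal Scala sketch of the same classpath-resource API. `ResourceReader` and the resource name in the usage note are hypothetical; only `ClassLoader.getResourceAsStream` and the `java.io` streams are standard.

```scala
import java.io.{ByteArrayOutputStream, InputStream}

object ResourceReader {
  /** Reads a classpath resource fully into memory; None if it does not exist. */
  def readBytes(name: String): Option[Array[Byte]] = {
    // ClassLoader lookups take a path without a leading slash.
    val in: InputStream = getClass.getClassLoader.getResourceAsStream(name)
    if (in == null) None
    else try Some(readAll(in)) finally in.close()
  }

  // Copies the stream into a byte array in 8 KB chunks.
  private def readAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](8192)
    var n = in.read(buf)
    while (n != -1) {
      out.write(buf, 0, n)
      n = in.read(buf)
    }
    out.toByteArray
  }
}
```

Usage would look like `ResourceReader.readBytes("food-classifier.bin")`, where the file name is whatever you placed in the resources folder. For a very large model, you would probably want to stream it rather than load it into one array.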

No. #2, added: 2017-11-08 19:11

You should serialize the data that results from training into its own file. You can then package this data file in your JAR. Your production code opens the file and reads it instead of rerunning the training algorithm.
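A rough sketch of that round trip using plain Java serialization. The `FoodClassifier` below is a hypothetical stand-in (the question does not show the real class's fields, and its `isHotDog` takes an `Image`, not a feature vector), and `ModelIO` is a made-up helper name; a real project might prefer a more compact or version-stable format.

```scala
import java.io._

// Hypothetical stand-in for the trained model. Case classes are Serializable.
final case class FoodClassifier(weights: Vector[Double]) {
  // Toy decision rule: positive dot product means "hot dog".
  def isHotDog(features: Vector[Double]): Boolean =
    weights.zip(features).map { case (w, x) => w * x }.sum > 0.0
}

object ModelIO {
  /** Run once after training: writes the model to a file to be packaged. */
  def save(model: FoodClassifier, file: File): Unit = {
    val out = new ObjectOutputStream(new FileOutputStream(file))
    try out.writeObject(model) finally out.close()
  }

  /** Run in production: reads the model back from any input stream. */
  def load(in: InputStream): FoodClassifier = {
    val ois = new ObjectInputStream(in)
    try ois.readObject().asInstanceOf[FoodClassifier] finally ois.close()
  }
}
```

Downstream code could then call something like `ModelIO.load(getClass.getResourceAsStream("/food-classifier.bin"))`, assuming the file was packaged under that (hypothetical) name. Taking an `InputStream` rather than a `File` matters here, because a resource inside a JAR is not a file on disk.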
