Google BigQuery is a powerful cloud-based data warehousing solution that enables users to analyze massive datasets quickly and efficiently. In Python, BigQuery DataFrames provide a Pythonic interface for interacting with BigQuery, allowing developers to leverage familiar tools and syntax for data querying and manipulation. In this comprehensive developer guide, we'll explore the usage of BigQuery DataFrames, their advantages, disadvantages, and potential performance issues.

Introduction To BigQuery DataFrames

BigQuery DataFrames serve as a bridge between Google BigQuery and Python, allowing seamless integration of BigQuery datasets into Python workflows. With BigQuery DataFrames, developers can use familiar libraries like Pandas to query, analyze, and manipulate BigQuery data. This Pythonic approach simplifies the development process and enhances productivity for data-driven applications.

Advantages of BigQuery DataFrames

- Pythonic Interface: BigQuery DataFrames provide a Pythonic interface for interacting with BigQuery, enabling developers to use familiar Python syntax and libraries.
- Integration With Pandas: Being compatible with Pandas, BigQuery DataFrames allow developers to leverage the rich functionality of Pandas for data manipulation.
- Seamless Query Execution: BigQuery DataFrames handle the execution of SQL queries behind the scenes, abstracting away the complexities of query execution.
- Scalability: Leveraging the power of Google Cloud Platform, BigQuery DataFrames offer scalability to handle large datasets efficiently.

Disadvantages of BigQuery DataFrames

- Limited Functionality: BigQuery DataFrames may lack certain advanced features and functionalities available in native BigQuery SQL.
- Data Transfer Costs: Transferring data between BigQuery and Python environments may incur data transfer costs, especially for large datasets.
- API Limitations: While BigQuery DataFrames provide a convenient interface, they may have limitations compared to using the BigQuery API directly for complex operations.

Prerequisites

- Google Cloud Platform (GCP) Account: Ensure an active GCP account with BigQuery access.
- Python Environment: Set up a Python environment with the required libraries (pandas, pandas_gbq, and google-cloud-bigquery).
- Project Configuration: Configure your GCP project and authenticate your Python environment with the necessary credentials.
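One common way to satisfy the last prerequisite is to create Application Default Credentials with the gcloud CLI; a minimal sketch is shown below, assuming any auth flow that produces credentials for your project is acceptable and that your_project_id is a placeholder:

Shell
    # Log in and create Application Default Credentials that the Python clients can pick up
    gcloud auth application-default login

    # Point the tooling at the project that owns your BigQuery datasets
    gcloud config set project your_project_id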
Using BigQuery DataFrames

Install Required Libraries

Install the necessary libraries using pip:

Shell
    pip install pandas pandas-gbq google-cloud-bigquery

Authenticate GCP Credentials

Authenticate your GCP credentials to enable interaction with BigQuery. The google.auth package resolves the Application Default Credentials configured for your environment:

Python
    import google.auth

    # Load GCP credentials (Application Default Credentials)
    credentials, _ = google.auth.default()

Querying BigQuery DataFrames

Use pandas_gbq to execute SQL queries and retrieve results as a DataFrame:

Python
    import pandas_gbq

    # SQL query
    query = "SELECT * FROM `your_project_id.your_dataset_id.your_table_id`"

    # Execute the query and retrieve a DataFrame
    df = pandas_gbq.read_gbq(query, project_id="your_project_id", credentials=credentials)

Writing to BigQuery

Write a DataFrame to a BigQuery table using pandas_gbq:

Python
    # Write the DataFrame to BigQuery
    pandas_gbq.to_gbq(df,
                      destination_table="your_project_id.your_dataset_id.your_new_table",
                      project_id="your_project_id",
                      if_exists="replace",
                      credentials=credentials)

Advanced Features

SQL Parameters

Pass parameters to your SQL queries dynamically:

Python
    params = {"param_name": "param_value"}
    query = ("SELECT * FROM `your_project_id.your_dataset_id.your_table_id` "
             "WHERE column_name = @param_name")
    df = pandas_gbq.read_gbq(query,
                             project_id="your_project_id",
                             credentials=credentials,
                             dialect="standard",
                             parameters=params)

Schema Customization

Customize the DataFrame schema during the write operation:

Python
    schema = [{"name": "column_name", "type": "INTEGER"},
              {"name": "another_column", "type": "STRING"}]
    pandas_gbq.to_gbq(df,
                      destination_table="your_project_id.your_dataset_id.your_custom_table",
                      project_id="your_project_id",
                      if_exists="replace",
                      credentials=credentials,
                      table_schema=schema)

Performance Considerations

- Data Volume: Performance may degrade with large datasets, especially when processing and transferring data between BigQuery and Python environments.
- Query Complexity: Complex SQL queries may lead to longer execution times, impacting overall performance.
- Network Latency: Network latency between the Python environment and BigQuery servers can affect query execution time, especially for remote connections.

Best Practices for Performance Optimization

- Use Query Filters: Apply filters to SQL queries to reduce the amount of data transferred between BigQuery and Python (a sketch follows this article's conclusion).
- Optimize SQL Queries: Write efficient SQL queries to minimize query execution time and reduce resource consumption.
- Cache Query Results: Cache query results in BigQuery to avoid re-executing queries for repeated requests.

Conclusion

BigQuery DataFrames offer a convenient and Pythonic way to interact with Google BigQuery, providing developers with flexibility and ease of use. While they offer several advantages, developers should be aware of potential limitations and performance considerations. By following best practices and optimizing query execution, developers can harness the full potential of BigQuery DataFrames for data analysis and manipulation in Python.
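To make the "use query filters" best practice above concrete, here is a minimal sketch; the table, column names, and the date threshold are placeholders, and the point is simply that filtering and aggregating in SQL keeps the transferred DataFrame small:

Python
    import google.auth
    import pandas_gbq

    credentials, _ = google.auth.default()

    # Push filtering and aggregation down to BigQuery instead of pulling the whole table
    filtered_query = """
        SELECT column_name, COUNT(*) AS row_count
        FROM `your_project_id.your_dataset_id.your_table_id`
        WHERE event_date >= '2024-01-01'   -- placeholder filter column and value
        GROUP BY column_name
    """
    small_df = pandas_gbq.read_gbq(filtered_query,
                                   project_id="your_project_id",
                                   credentials=credentials)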
Java interfaces, for a very long time, were just that — interfaces, an anemic set of function prototypes. Even then, there were non-standard uses of interfaces (for example, marker interfaces), but that's it. However, since Java 8, there have been substantial changes in interfaces. The addition of default and static methods enabled many new possibilities: for example, adding new functionality to existing interfaces without breaking old code, or hiding all implementations behind factory methods, enforcing the "code against interface" policy. The addition of sealed interfaces enabled the creation of true sum types and the expression of design intent in code. Together, these changes made Java interfaces a powerful, concise, and expressive tool. Let's take a look at some non-traditional applications of Java interfaces.

Fluent Builder

Fluent (or Staged) Builder is a pattern used to assemble object instances. Unlike the traditional Builder pattern, it prevents the creation of incomplete objects and enforces a fixed order of field initialization. These properties make it the preferred choice for reliable and maintainable code.

The idea behind Fluent Builder is rather simple. Instead of returning the same Builder instance after setting a property, it returns a new type (class or interface) that has only one method, thereby guiding the developer through the process of instance initialization. A fluent builder may omit the build() method at the end; instead, assembly ends once the last field is set. Unfortunately, the straightforward implementation of Fluent Builder is very verbose:

Java
    public record NameAge(String firstName, String lastName, Option<String> middleName, int age) {

        public static NameAgeBuilderStage1 builder() {
            return new NameAgeBuilder();
        }

        public static class NameAgeBuilder implements NameAgeBuilderStage1, NameAgeBuilderStage2,
                                                      NameAgeBuilderStage3, NameAgeBuilderStage4 {
            private String firstName;
            private String lastName;
            private Option<String> middleName;

            @Override
            public NameAgeBuilderStage2 firstName(String firstName) {
                this.firstName = firstName;
                return this;
            }

            @Override
            public NameAgeBuilderStage3 lastName(String lastName) {
                this.lastName = lastName;
                return this;
            }

            @Override
            public NameAgeBuilderStage4 middleName(Option<String> middleName) {
                this.middleName = middleName;
                return this;
            }

            @Override
            public NameAge age(int age) {
                return new NameAge(firstName, lastName, middleName, age);
            }
        }

        public interface NameAgeBuilderStage1 {
            NameAgeBuilderStage2 firstName(String firstName);
        }

        public interface NameAgeBuilderStage2 {
            NameAgeBuilderStage3 lastName(String lastName);
        }

        public interface NameAgeBuilderStage3 {
            NameAgeBuilderStage4 middleName(Option<String> middleName);
        }

        public interface NameAgeBuilderStage4 {
            NameAge age(int age);
        }
    }

It is also not very safe, as it is still possible to cast the returned interface to NameAgeBuilder and call the age() method, obtaining an incomplete object. We might notice that each interface is a typical functional interface with only one method inside.
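For completeness, here is what the staged builder looks like from the caller's side — a sketch that assumes the NameAge record above and an Option type with an empty() factory (that factory name is an assumption, not from the original code):

Java
    // Each stage exposes exactly one method, so the fields can only be supplied in this order,
    // and the final age(...) call produces the finished NameAge instance.
    NameAge nameAge = NameAge.builder()
            .firstName("John")
            .lastName("Doe")
            .middleName(Option.empty()) // assumes Option offers an empty() factory
            .age(42);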
With this in mind, we may rewrite the code above into the following:

Java
    public record NameAge(String firstName, String lastName, Option<String> middleName, int age) {

        static NameAgeBuilderStage1 builder() {
            return firstName -> lastName -> middleName -> age ->
                    new NameAge(firstName, lastName, middleName, age);
        }

        public interface NameAgeBuilderStage1 {
            NameAgeBuilderStage2 firstName(String firstName);
        }

        public interface NameAgeBuilderStage2 {
            NameAgeBuilderStage3 lastName(String lastName);
        }

        public interface NameAgeBuilderStage3 {
            NameAgeBuilderStage4 middleName(Option<String> middleName);
        }

        public interface NameAgeBuilderStage4 {
            NameAge age(int age);
        }
    }

Besides being much more concise, this version is not susceptible to (even hacky) premature object creation.

Reduction of Implementation

Although default methods were created to enable the extension of existing interfaces without breaking existing implementations, this is not their only use. For a long time, if we needed multiple implementations of the same interface that shared some code, the only way to avoid code duplication was to create an abstract class and inherit those implementations from it. Although this avoided code duplication, the solution is relatively verbose and causes unnecessary coupling: the abstract class is a purely technical entity that has no corresponding part in the application domain. With default methods, abstract classes are no longer necessary; common functionality can be written directly in the interface, reducing boilerplate, eliminating coupling, and improving maintainability.

But what if we go further? Sometimes it is possible to express all the necessary functionality using only a few implementation-specific methods. Ideally, just one. This makes implementation classes very compact and easy to reason about and maintain.

Let's, for example, implement a Maybe<T> monad (yet another name for Optional<T>/Option<T>). No matter how rich and diverse an API we plan to implement, it can still be expressed as calls to a single method; let's call it fold():

Java
    <R> R fold(Supplier<? extends R> nothingMapper, Function<? super T, ? extends R> justMapper);

This method accepts two functions: one is called when the value is present, and the other when the value is missing. The result of the applied function is simply returned as the result of the implemented method. With this method, we can implement map() and flatMap() as:

Java
    default <U> Maybe<U> map(Function<? super T, U> mapper) {
        return fold(Maybe::nothing, t -> just(mapper.apply(t)));
    }

    default <U> Maybe<U> flatMap(Function<? super T, Maybe<U>> mapper) {
        return fold(Maybe::nothing, mapper);
    }

These implementations are universal and applicable to both variants. Note that since we have exactly two implementations, it makes perfect sense to make the interface sealed. And to even further reduce the amount of boilerplate — use records:

Java
    public sealed interface Maybe<T> {
        default <U> Maybe<U> map(Function<? super T, U> mapper) {
            return fold(Maybe::nothing, t -> just(mapper.apply(t)));
        }

        default <U> Maybe<U> flatMap(Function<? super T, Maybe<U>> mapper) {
            return fold(Maybe::nothing, mapper);
        }

        <R> R fold(Supplier<? extends R> nothingMapper, Function<? super T, ? extends R> justMapper);

        static <T> Just<T> just(T value) {
            return new Just<>(value);
        }

        @SuppressWarnings("unchecked")
        static <T> Nothing<T> nothing() {
            return (Nothing<T>) Nothing.INSTANCE;
        }

        static <T> Maybe<T> maybe(T value) {
            return value == null ? nothing() : just(value);
        }

        record Just<T>(T value) implements Maybe<T> {
            public <R> R fold(Supplier<? extends R> nothingMapper, Function<? super T, ? extends R> justMapper) {
                return justMapper.apply(value);
            }
        }

        record Nothing<T>() implements Maybe<T> {
            static final Nothing<?> INSTANCE = new Nothing<>();

            @Override
            public <R> R fold(Supplier<? extends R> nothingMapper, Function<? super T, ? extends R> justMapper) {
                return nothingMapper.get();
            }
        }
    }

Although this is not strictly necessary for the demonstration, this implementation uses a shared constant for the Nothing implementation, reducing allocation. Another interesting property of this implementation is that it uses no if statement (nor ternary operator) for the logic. This improves performance and enables better optimization by the Java compiler. Another useful property is that it is convenient for pattern matching (unlike Java's Optional, for example):

Java
    var result = switch (maybe) {
        case Just<String>(var value) -> value;
        case Nothing<String> nothing -> "Nothing";
    };

But sometimes even implementation classes are not necessary. The example below shows how an entire implementation fits into the interface (the full code can be found here):

Java
    public interface ShortenedUrlRepository {
        default Promise<ShortenedUrl> create(ShortenedUrl shortenedUrl) {
            return QRY."INSERT INTO shortenedurl (\{template().fieldNames()}) VALUES (\{template().fieldValues(shortenedUrl)}) RETURNING *"
                    .in(db())
                    .asSingle(template());
        }

        default Promise<ShortenedUrl> read(String id) {
            return QRY."SELECT * FROM shortenedurl WHERE id = \{id}"
                    .in(db())
                    .asSingle(template());
        }

        default Promise<Unit> delete(String id) {
            return QRY."DELETE FROM shortenedurl WHERE id = \{id}"
                    .in(db())
                    .asUnit();
        }

        DbEnv db();
    }

To turn this interface into a working instance, all we need is to provide an instance of the environment. For example, like this:

Java
    var dbEnv = DbEnv.with(dbEnvConfig);
    ShortenedUrlRepository repository = () -> dbEnv;

This approach sometimes results in code that is almost too concise and occasionally requires writing a more verbose version to preserve context. I'd say that this is quite an unusual property for Java code, which is often blamed for verbosity.

Utility … Interfaces?

Utility (as well as constant) interfaces were not feasible for a long time. Perhaps the main reason is that such interfaces could be implemented, and the constants and utility functions would then become an (unnecessary) part of the implementation. But with sealed interfaces, this issue can be solved in a way similar to how instantiation of utility classes is prevented:

Java
    public sealed interface Utility {
        ...

        record unused() implements Utility {}
    }

At first look, this approach may not seem to make much sense. However, the use of an interface eliminates the need for visibility modifiers on each method and/or constant. This, in turn, reduces the amount of syntactic noise, which is mandatory for classes but redundant for interfaces, as all interface members are public.

Interfaces and Private Records

The combination of these two constructs enables convenient writing of code in an "OO without classes" style, enforcing "code against interface" while reducing boilerplate at the same time.
For example:

Java
    public interface ContentType {
        String headerText();

        ContentCategory category();

        static ContentType custom(String headerText, ContentCategory category) {
            record contentType(String headerText, ContentCategory category) implements ContentType {}

            return new contentType(headerText, category);
        }
    }

The private record serves two purposes:

- It keeps the use of the implementation under complete control. No direct instantiation is possible, only via the static factory method.
- It keeps the implementation close to the interface, simplifying support, extension, and maintenance.

Note that the interface is not sealed, so one can do, for example, the following:

Java
    public enum CommonContentTypes implements ContentType {
        TEXT_PLAIN("text/plain; charset=UTF-8", ContentCategory.PLAIN_TEXT),
        APPLICATION_JSON("application/json; charset=UTF-8", ContentCategory.JSON),
        ;

        private final String headerText;
        private final ContentCategory category;

        CommonContentTypes(String headerText, ContentCategory category) {
            this.headerText = headerText;
            this.category = category;
        }

        @Override
        public String headerText() {
            return headerText;
        }

        @Override
        public ContentCategory category() {
            return category;
        }
    }

Conclusion

Interfaces are a powerful Java feature, often underestimated and underutilized. This article is an attempt to shed light on possible ways to utilize their power and get clean, expressive, concise, yet readable code.
In the world of Spring Boot, making HTTP requests to external services is a common task. Traditionally, developers have relied on RestTemplate for this purpose. However, with the evolution of the Spring Framework, a new and more powerful way to handle HTTP requests has emerged: the WebClient. In Spring Boot 3.2, a new addition called RestClient builds upon WebClient, providing a more intuitive and modern approach to consuming RESTful services.

Origins of RestTemplate

RestTemplate has been a staple in the Spring ecosystem for years. It's a synchronous client for making HTTP requests and processing responses. With RestTemplate, developers could easily interact with RESTful APIs using familiar Java syntax. However, as applications became more asynchronous and non-blocking, the limitations of RestTemplate started to become apparent. Here's a basic example of using RestTemplate to fetch data from an external API:

Java
    var restTemplate = new RestTemplate();
    var response = restTemplate.getForObject("https://api.example.com/data", String.class);
    System.out.println(response);

Introduction of WebClient

With the advent of Spring WebFlux, an asynchronous, non-blocking web framework, WebClient was introduced as a modern alternative to RestTemplate. WebClient embraces reactive principles, making it well-suited for building reactive applications. It offers support for both synchronous and asynchronous communication, along with a fluent API for composing requests. Here's how you would use WebClient to make the same HTTP request:

Java
    var webClient = WebClient.create();
    var response = webClient.get()
            .uri("https://api.example.com/data")
            .retrieve()
            .bodyToMono(String.class);
    response.subscribe(System.out::println);

Enter RestClient in Spring Boot 3.2

Spring Boot 3.2 brings RestClient, a higher-level abstraction built on top of WebClient. RestClient simplifies the process of making HTTP requests even further by providing a more intuitive fluent API and reducing boilerplate code. It retains all the capabilities of WebClient while offering a more developer-friendly interface. Let's take a look at how RestClient can be used:

Java
    var response = restClient
            .get()
            .uri("https://api.example.com/data")
            .retrieve()
            .toEntity(String.class);
    System.out.println(response.getBody());

With RestClient, the code becomes more concise and readable. The RestClient handles the creation of WebClient instances internally, abstracting away the complexities of setting up and managing HTTP clients.
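Before the comparison below, it is worth noting that a RestClient is typically built once and reused. A minimal sketch of creating one with a base URL and a default header follows; the URL and header values are placeholders:

Java
    var restClient = RestClient.builder()
            .baseUrl("https://api.example.com")          // placeholder base URL
            .defaultHeader("Accept", "application/json") // applied to every request
            .build();

    // Relative URIs are resolved against the base URL
    var body = restClient.get()
            .uri("/data")
            .retrieve()
            .body(String.class);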
Comparing RestClient With RestTemplate

Let's compare RestClient with RestTemplate by looking at some common scenarios.

Create

RestTemplate:

Java
    var restTemplate = new RestTemplate();

RestClient:

Java
    var restClient = RestClient.create();

Or we can reuse the configuration of an existing RestTemplate as well:

Java
    var myOldRestTemplate = new RestTemplate();
    var restClient = RestClient.builder(myOldRestTemplate).build();

GET Request

RestTemplate:

Java
    var response = restTemplate.getForObject("https://api.example.com/data", String.class);

RestClient:

Java
    var response = restClient
            .get()
            .uri("https://api.example.com/data")
            .retrieve()
            .toEntity(String.class);

POST Request

RestTemplate:

Java
    ResponseEntity<String> response = restTemplate.postForEntity("https://api.example.com/data", request, String.class);

RestClient:

Java
    var response = restClient
            .post()
            .uri("https://api.example.com/data")
            .body(request)
            .retrieve()
            .toEntity(String.class);

Error Handling

RestTemplate:

Java
    try {
        String response = restTemplate.getForObject("https://api.example.com/data", String.class);
    } catch (RestClientException ex) {
        // Handle exception
    }

RestClient:

Java
    String result = restClient.get()
            .uri("https://api.example.com/this-url-does-not-exist")
            .retrieve()
            .onStatus(HttpStatusCode::is4xxClientError, (request, response) -> {
                throw new MyCustomRuntimeException(response.getStatusCode(), response.getHeaders());
            })
            .body(String.class);

As seen in these examples, RestClient offers a more streamlined approach to making HTTP requests compared to RestTemplate. The Spring documentation gives us many other examples.

Conclusion

In Spring Boot 3.2, RestClient emerges as a modern replacement for RestTemplate, offering a more intuitive and concise way to consume RESTful services. Built on top of WebClient, RestClient embraces reactive principles while simplifying the process of making HTTP requests. Developers can now enjoy improved productivity and cleaner code when interacting with external APIs in their Spring Boot applications. It's recommended to transition from RestTemplate to RestClient for a more efficient and future-proof codebase.
A sign of a good understanding of a programming language is not whether one is simply knowledgeable about the language's functionality, but whether one knows why that functionality exists. Without knowing this "why," the developer runs the risk of using functionality in situations where its use might not be ideal - or should even be avoided entirely! The case in point for this article is the lateinit keyword in Kotlin. Its presence in the language is more or less a way to resolve what would otherwise be contradictory goals for Kotlin:

- Maintain compatibility with existing Java code and make it easy to transcribe from Java to Kotlin. If Kotlin were too dissimilar to Java - and if the interaction between Kotlin and Java code bases were too much of a hassle - then adoption of the language might have never taken off.
- Prevent developers from declaring class members without explicitly declaring their value, either directly or via constructors. In Java, doing so assigns a default value, and this leaves non-primitives - which are assigned a null value - at risk of provoking a NullPointerException if they are accessed without a value being provided beforehand.

The problem here is this: what happens when it's impossible to declare a class field's value immediately? Take, for example, the extension model in the JUnit 5 testing framework. Extensions are a tool for creating reusable code that conducts setup and cleanup actions before and after the execution of each or all tests. Below is an example of an extension whose purpose is to clear out all designated database tables after the execution of each test via a Spring bean that serves as the database interface:

Java
    public class DBExtension implements BeforeAllCallback, AfterEachCallback {

        private NamedParameterJdbcOperations jdbcOperations;

        @Override
        public void beforeAll(ExtensionContext extensionContext) {
            jdbcOperations = SpringExtension.getApplicationContext(extensionContext)
                .getBean(NamedParameterJdbcTemplate.class);
            clearDB();
        }

        @Override
        public void afterEach(ExtensionContext extensionContext) throws Exception {
            clearDB();
        }

        private void clearDB() {
            Stream.of("table_one", "table_two", "table_three").forEach((tableName) ->
                jdbcOperations.update("TRUNCATE " + tableName, new MapSqlParameterSource())
            );
        }
    }

(NOTE: Yes, using the @Transactional annotation is possible for Spring Boot tests that conduct database transactions, but some use cases make automated transaction roll-backs impossible; for example, when a separate thread is spawned to execute the code for the database interactions.)

Given that the field jdbcOperations relies on the Spring framework loading the proper database interface bean when the application is loaded, it cannot be assigned any substantial value upon declaration. Thus, it receives an implicit default value of null until the beforeAll() function is executed.
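A direct transcription of that field into Kotlin illustrates the conflict: the compiler rejects a non-null property that has no initializer. A minimal sketch, not from the original article:

Kotlin
    class DBExtension : BeforeAllCallback, AfterEachCallback {
        // Does not compile: "Property must be initialized or be abstract"
        private var jdbcOperations: NamedParameterJdbcOperations
        // ...
    }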
As described above, this approach is forbidden in Kotlin, so the developer has three options:

1. Declare jdbcOperations as var, assign a garbage value to it in its declaration, then assign the "real" value to the field in beforeAll():

Kotlin
    class DBExtension : BeforeAllCallback, AfterEachCallback {

        private var jdbcOperations: NamedParameterJdbcOperations = StubJdbcOperations()

        override fun beforeAll(extensionContext: ExtensionContext) {
            jdbcOperations = SpringExtension.getApplicationContext(extensionContext)
                .getBean(NamedParameterJdbcOperations::class.java)
            clearDB()
        }

        override fun afterEach(extensionContext: ExtensionContext) {
            clearDB()
        }

        private fun clearDB() {
            listOf("table_one", "table_two", "table_three").forEach { tableName: String ->
                jdbcOperations.update("TRUNCATE $tableName", MapSqlParameterSource())
            }
        }
    }

The downside here is that there's no check for whether the field has been assigned the "real" value, running the risk of invalid behavior when the field is accessed if the "real" value hasn't been assigned for whatever reason.

2. Declare jdbcOperations as nullable and assign null to the field, after which the field will be assigned its "real" value in beforeAll():

Kotlin
    class DBExtension : BeforeAllCallback, AfterEachCallback {

        private var jdbcOperations: NamedParameterJdbcOperations? = null

        override fun beforeAll(extensionContext: ExtensionContext) {
            jdbcOperations = SpringExtension.getApplicationContext(extensionContext)
                .getBean(NamedParameterJdbcOperations::class.java)
            clearDB()
        }

        override fun afterEach(extensionContext: ExtensionContext) {
            clearDB()
        }

        private fun clearDB() {
            listOf("table_one", "table_two", "table_three").forEach { tableName: String ->
                jdbcOperations!!.update("TRUNCATE $tableName", MapSqlParameterSource())
            }
        }
    }

The downside here is that declaring the field as nullable is permanent; there's no mechanism to declare a type as nullable "only" until its value has been assigned elsewhere. Thus, this approach forces the developer to force the non-nullable conversion whenever accessing the field, in this case using the double-bang (i.e., !!) operator to access the field's update() function.

3. Utilize the lateinit keyword to postpone the value assignment to jdbcOperations until the execution of the beforeAll() function:

Kotlin
    class DBExtension : BeforeAllCallback, AfterEachCallback {

        private lateinit var jdbcOperations: NamedParameterJdbcOperations

        override fun beforeAll(extensionContext: ExtensionContext) {
            jdbcOperations = SpringExtension.getApplicationContext(extensionContext)
                .getBean(NamedParameterJdbcOperations::class.java)
            clearDB()
        }

        override fun afterEach(extensionContext: ExtensionContext) {
            clearDB()
        }

        private fun clearDB() {
            listOf("table_one", "table_two", "table_three").forEach { tableName: String ->
                jdbcOperations.update("TRUNCATE $tableName", MapSqlParameterSource())
            }
        }
    }

No more worrying about silently invalid behavior or being forced to "de-nullify" the field each time it's accessed!
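If a defensive check is ever needed before such a field is used, Kotlin does expose a run-time guard for lateinit properties. A small sketch, not part of the original extension:

Kotlin
    private fun clearDBIfReady() {
        // isInitialized is only available on lateinit properties accessible in the current scope
        if (this::jdbcOperations.isInitialized) {
            clearDB()
        }
    }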
The "catch" is that there's still no compile-time mechanism for determining whether the field has been accessed before it's been assigned a value - the check is done at run-time, as can be seen when decompiling the clearDB() function:

Java
    private final void clearDB() {
        Iterable $this$forEach$iv = (Iterable)CollectionsKt.listOf(new String[]{"table_one", "table_two", "table_three"});
        int $i$f$forEach = false;
        NamedParameterJdbcOperations var10000;
        String tableName;
        for(Iterator var3 = $this$forEach$iv.iterator(); var3.hasNext(); var10000.update("TRUNCATE " + tableName, (SqlParameterSource)(new MapSqlParameterSource()))) {
            Object element$iv = var3.next();
            tableName = (String)element$iv;
            int var6 = false;
            var10000 = this.jdbcOperations;
            if (var10000 == null) {
                Intrinsics.throwUninitializedPropertyAccessException("jdbcOperations");
            }
        }
    }

Not ideal, considering what's arguably Kotlin's star feature (compile-time checking of variable nullability to reduce the likelihood of the "Billion-Dollar Mistake") - but again, it's a "least-worst" compromise to bridge the gap between Kotlin code and Java-based code that provides no alternatives adhering to Kotlin's design philosophy.

Use Wisely!

Aside from the above-mentioned issue of conducting null checks only at run-time instead of compile-time, lateinit possesses a few more drawbacks:

- A field that uses lateinit cannot be an immutable val, as its value is assigned at some point after the field's declaration, so the field is exposed to the risk of inadvertently being modified at some point by an unwitting developer, causing logic errors.
- Because the field is not instantiated upon declaration, any other fields that rely on this field - be it via some function call to the field or passing it in as an argument to a constructor - cannot be instantiated upon declaration either. This makes lateinit a bit of a "viral" feature: using it on field A forces other fields that rely on field A to use lateinit as well.

Given that this mutability of lateinit fields goes against another of Kotlin's guiding principles - make fields and variables immutable where possible (for example, function arguments are completely immutable) to avoid logic errors caused by mutating a field/variable that shouldn't have been changed - its use should be restricted to cases where no alternatives exist. Unfortunately, several code patterns that are prevalent in Spring Boot and Mockito - and likely elsewhere, but that's outside the scope of this article - were built on Java's tendency to permit uninstantiated field declarations. This is where the ease of transcribing Java code to Kotlin code becomes a double-edged sword: it's easy to simply move the Java code over to a Kotlin file, slap the lateinit keyword on a field that hasn't been directly instantiated in the Java code, and call it a day.
Take, for instance, a test class that:

- Auto-wires a bean that's been registered in the Spring Boot component ecosystem
- Injects a configuration value that's been loaded in the Spring Boot environment
- Mocks a field's value and then passes said mock into another field's object
- Creates an argument captor for validating arguments that are passed to specified functions during the execution of one or more test cases
- Instantiates a mocked version of a bean that has been registered in the Spring Boot component ecosystem and passes it to a field in the test class

Here is the code for all of these points put together:

Kotlin
    @SpringBootTest
    @ExtendWith(MockitoExtension::class)
    @AutoConfigureMockMvc
    class FooTest {

        @Autowired
        private lateinit var mockMvc: MockMvc

        @Value("\${foo.value}")
        private lateinit var fooValue: String

        @Mock
        private lateinit var otherFooRepo: OtherFooRepo

        @InjectMocks
        private lateinit var otherFooService: OtherFooService

        @Captor
        private lateinit var timestampCaptor: ArgumentCaptor<Long>

        @MockBean
        private lateinit var fooRepo: FooRepo

        // Tests below
    }

A better world is possible! Here are ways to avoid each of these constructs so that one can write "good" idiomatic Kotlin code while still retaining the use of auto-wiring, object mocking, and argument capturing in the tests.

Becoming "Punctual"

Note: The code in these examples uses Java 17, Kotlin 1.9.21, Spring Boot 3.2.0, and Mockito 5.7.0.

@Autowired/@Value

Both of these constructs originate in the historic practice of having Spring Boot inject the values for the fields in question after their containing class has been initialized. This practice has since been deprecated in favor of declaring the values that are to be injected into the fields as arguments of the class's constructor. For example, this code follows the old practice:

Kotlin
    @Service
    class FooService {

        @Autowired
        private lateinit var fooRepo: FooRepo

        @Value("\${foo.value}")
        private lateinit var fooValue: String
    }

It can be updated to this code:

Kotlin
    @Service
    class FooService(
        private val fooRepo: FooRepo,
        @Value("\${foo.value}") private val fooValue: String,
    ) {
    }

Note that aside from being able to use the val keyword, the @Autowired annotation can be removed from the declaration of fooRepo as well, as the Spring Boot injection mechanism is smart enough to recognize that fooRepo refers to a bean that can be instantiated and passed in automatically. Omitting the @Autowired annotation isn't possible for test code - test files aren't actually a part of the Spring Boot component ecosystem and thus won't know by default that they need to rely on the auto-wired resource injection system - but otherwise, the pattern is the same:

Kotlin
    @SpringBootTest
    @ExtendWith(MockitoExtension::class)
    @AutoConfigureMockMvc
    class FooTest(
        @Autowired private val mockMvc: MockMvc,
        @Value("\${foo.value}") private val fooValue: String,
    ) {

        @Mock
        private lateinit var otherFooRepo: OtherFooRepo

        @InjectMocks
        private lateinit var otherFooService: OtherFooService

        @Captor
        private lateinit var timestampCaptor: ArgumentCaptor<Long>

        @MockBean
        private lateinit var fooRepo: FooRepo

        // Tests below
    }

@Mock/@InjectMocks

The Mockito extension for JUnit allows a developer to declare a mock object and leave the actual mock instantiation and resetting of the mock's behavior - as well as the injection of these mocks into dependent objects like otherFooService in the example code - to the code within MockitoExtension.
Aside from the disadvantages mentioned above about being forced to use mutable objects, this poses quite a bit of "magic" around the lifecycle of the mocked objects that can be easily avoided by directly instantiating the objects and manipulating their behavior:

Kotlin
    @SpringBootTest
    @ExtendWith(MockitoExtension::class)
    @AutoConfigureMockMvc
    class FooTest(
        @Autowired private val mockMvc: MockMvc,
        @Value("\${foo.value}") private val fooValue: String,
    ) {

        private val otherFooRepo: OtherFooRepo = mock()
        private val otherFooService = OtherFooService(otherFooRepo)

        @Captor
        private lateinit var timestampCaptor: ArgumentCaptor<Long>

        @MockBean
        private lateinit var fooRepo: FooRepo

        @AfterEach
        fun afterEach() {
            reset(otherFooRepo)
        }

        // Tests below
    }

As can be seen above, a post-execution hook is now necessary to clean up the mocked object otherFooRepo after the test execution(s), but this drawback is more than made up for by otherFooRepo and otherFooService now being immutable, as well as by having complete control over both objects' lifetimes.

@Captor

Just as with the @Mock annotation, it's possible to remove the @Captor annotation from the argument captor and declare its value directly in the code:

Kotlin
    @SpringBootTest
    @AutoConfigureMockMvc
    class FooTest(
        @Autowired private val mockMvc: MockMvc,
        @Value("\${foo.value}") private val fooValue: String,
    ) {

        private val otherFooRepo: OtherFooRepo = mock()
        private val otherFooService = OtherFooService(otherFooRepo)

        private val timestampCaptor: ArgumentCaptor<Long> = ArgumentCaptor.captor()

        @MockBean
        private lateinit var fooRepo: FooRepo

        @AfterEach
        fun afterEach() {
            reset(otherFooRepo)
        }

        // Tests below
    }

While there's a downside in that there's no mechanism for resetting the argument captor after each test (meaning that a call to getAllValues() would return artifacts from other test cases' executions), a case can be made that an argument captor should be instantiated as an object only within the test cases where it is used, doing away with the argument captor as a test class field. In any case, now that both @Mock and @Captor have been removed, it's possible to remove the Mockito extension as well.

@MockBean

A caveat here: the use of mock beans in Spring Boot tests could be considered a code smell, signaling, among other possible issues, that the IO layer of the application isn't being properly controlled for integration tests, or that the test is de facto a unit test and should be rewritten as such. Furthermore, too much usage of mocked beans in different arrangements can cause test execution times to spike. Nonetheless, if it's absolutely necessary to use mocked beans in the tests, a solution does exist for converting them into immutable objects. As it turns out, the @MockBean annotation can be used not just on field declarations but also on class declarations. Furthermore, when used at the class level, it's possible to pass the classes that are to be declared as mock beans for the test in the annotation's value array.
This results in the mock bean now being eligible to be declared as an @Autowired bean, just like any "normal" Spring Boot bean being passed to a test class:

Kotlin
    @SpringBootTest
    @AutoConfigureMockMvc
    @MockBean(value = [FooRepo::class])
    class FooTest(
        @Autowired private val mockMvc: MockMvc,
        @Value("\${foo.value}") private val fooValue: String,
        @Autowired private val fooRepo: FooRepo,
    ) {

        private val otherFooRepo: OtherFooRepo = mock()
        private val otherFooService = OtherFooService(otherFooRepo)

        private val timestampCaptor: ArgumentCaptor<Long> = ArgumentCaptor.captor()

        @AfterEach
        fun afterEach() {
            reset(fooRepo, otherFooRepo)
        }

        // Tests below
    }

Note that, like otherFooRepo, the object will have to be reset in the cleanup hook. Also, there's no indication that fooRepo is a mocked object as it's being passed to the constructor of the test class, so patterns like declaring all mocked beans in an abstract class and then passing them to specific extending test classes run the risk of "out of sight, out of mind": the knowledge that the bean is mocked is not inherently evident. Furthermore, better alternatives to mocking beans exist (for example, WireMock and Testcontainers) for mocking out the behavior of external components.

Conclusion

Each of these techniques is possible for code written in Java as well and provides the very same benefits of immutability and control over the objects' lifecycles. What makes these recommendations even more pertinent to Kotlin is that they align more closely with Kotlin's design philosophy. Kotlin isn't simply "Java with better typing": it's a programming language that places an emphasis on reducing common programming errors, like accidentally accessing null pointers, as well as pitfalls like inadvertently re-assigning objects. Going beyond merely looking up the tools at one's disposal in Kotlin to finding out why they exist in the form that they do will yield much higher productivity in the language, less risk of fighting against the language instead of focusing on the tasks at hand, and, quite possibly, an even more rewarding and fun experience writing code in it.
The calculation of the norm of vectors is essential in both artificial intelligence and quantum computing for tasks such as feature scaling, regularization, distance metrics, convergence criteria, representing quantum states, ensuring unitarity of operations, error correction, and designing quantum algorithms and circuits. You will learn how to calculate the Euclidean norm and distance, also known as the L2 norm and L2 distance, of a single-dimensional (1D) tensor in Python libraries like NumPy, SciPy, Scikit-learn, TensorFlow, and PyTorch.

Understand Norm vs. Distance

Before we begin, let's understand the difference between the Euclidean norm and the Euclidean distance:

- The norm is the distance/length/size of a vector from the origin (0,0).
- The distance is the distance/length/size between two vectors.

Prerequisites

Install Jupyter, then run the code below in a Jupyter Notebook to install the prerequisites:

Python
    # Install the prerequisites for you to run the notebook
    !pip install numpy
    !pip install scipy
    %pip install torch
    !pip install tensorflow

You will use Jupyter Notebook to run the Python code cells that calculate the L2 norm in the different Python libraries.

Let's Get Started

Now that you have Jupyter set up on your machine and have installed the required Python libraries, let's get started by defining a 1D tensor using NumPy.

NumPy

NumPy is a Python library used for scientific computing. NumPy provides a multidimensional array and other derived objects.

Tensor ranks

Python
    # Define a single dimensional (1D) tensor
    import numpy as np

    vector1 = np.array([3,7]) #np.random.randint(1,5,2)
    vector2 = np.array([5,2]) #np.random.randint(1,5,2)
    print("Vector 1:",vector1)
    print("Vector 2:",vector2)
    print(f"shape & size of Vector1 & Vector2:", vector1.shape, vector1.size)

Print the vectors:

Plain Text
    Vector 1: [3 7]
    Vector 2: [5 2]
    shape & size of Vector1 & Vector2: (2,) 2

Matplotlib

Matplotlib is a Python visualization library for creating static, animated, and interactive visualizations. You will use Matplotlib's quiver to plot the vectors.

Python
    # Draw the vectors using Matplotlib
    import matplotlib.pyplot as plt
    %matplotlib inline

    origin = np.array([0,0])
    plt.quiver(*origin, vector1[0],vector1[1], angles='xy', color='r', scale_units='xy', scale=1)
    plt.quiver(*origin, vector2[0],vector2[1], angles='xy', color='b', scale_units='xy', scale=1)
    plt.plot([vector1[0],vector2[0]], [vector1[1],vector2[1]], 'go', linestyle="--")
    plt.title('Vector Representation')
    plt.xlim([0,10])
    plt.ylim([0,10])
    plt.grid()
    plt.show()

Vector representation using Matplotlib

Python
    # L2 (Euclidean) norm of a vector
    # NumPy
    norm1 = np.linalg.norm(vector1, ord=2)
    print("The magnitude / distance from the origin",norm1)

    norm2 = np.linalg.norm(vector2, ord=2)
    print("The magnitude / distance from the origin",norm2)

The output once you run this in the Jupyter Notebook:

Plain Text
    The magnitude / distance from the origin 7.615773105863909
    The magnitude / distance from the origin 5.385164807134504

SciPy

SciPy is built on NumPy and is used for mathematical computations. If you observe, SciPy uses the same linalg functions as NumPy.
Python
    # SciPy
    import scipy

    norm_vector1 = scipy.linalg.norm(vector1, ord=2)
    print("L2 norm in scipy for vector1:", norm_vector1)

    norm_vector2 = scipy.linalg.norm(vector2, ord=2)
    print("L2 norm in scipy for vector2:", norm_vector2)

Output:

Plain Text
    L2 norm in scipy for vector1: 7.615773105863909
    L2 norm in scipy for vector2: 5.385164807134504

Scikit-learn

As the Scikit-learn documentation says, Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities. We reshape the vector because Scikit-learn expects it to be 2-dimensional.

Python
    # Sklearn
    from sklearn.metrics.pairwise import euclidean_distances

    vector1_reshape = vector1.reshape(1,-1)
    ## Scikit-learn expects the vector to be 2-dimensional
    euclidean_distances(vector1_reshape, [[0, 0]])[0,0]

Output:

Plain Text
    7.615773105863909

TensorFlow

TensorFlow is an end-to-end machine learning platform.

Python
    # TensorFlow
    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
    import tensorflow as tf

    print("TensorFlow version:", tf.__version__)

    ## TensorFlow expects tensors of type float32, float64, complex64, or complex128
    vector1_tf = vector1.astype(np.float64)
    tf_norm = tf.norm(vector1_tf, ord=2)
    print("Euclidean(l2) norm in TensorFlow:",tf_norm.numpy())

The output prints the version of TensorFlow and the L2 norm:

Plain Text
    TensorFlow version: 2.15.0
    Euclidean(l2) norm in TensorFlow: 7.615773105863909

PyTorch

PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.

Python
    # PyTorch
    import torch

    print("PyTorch version:", torch.__version__)
    norm_torch = torch.linalg.norm(torch.from_numpy(vector1_tf), ord=2)
    norm_torch.item()

The output prints the PyTorch version and the norm:

Plain Text
    PyTorch version: 2.1.2
    7.615773105863909

Euclidean Distance

The Euclidean distance is calculated in the same way as the norm, except that you first compute the difference between the vectors and then pass that difference - vector_diff, in this case - to the respective libraries.

Python
    # Euclidean distance between the vectors
    import math

    vector_diff = vector1 - vector2

    # Using norm
    euclidean_distance = np.linalg.norm(vector_diff, ord=2)
    print(euclidean_distance)

    # Using dot product
    norm_dot = math.sqrt(np.dot(vector_diff.T,vector_diff))
    print(norm_dot)

Output using the norm and dot functions of the NumPy library:

Plain Text
    5.385164807134504
    5.385164807134504

Python
    # SciPy
    from scipy.spatial import distance

    distance.euclidean(vector1,vector2)

Output using SciPy:

Plain Text
    5.385164807134504

The Jupyter Notebook with the outputs is available on the GitHub repository. You can run the Jupyter Notebook on Colab following the instructions in the GitHub repository.
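As a cross-check on the library results above, the same values follow directly from the definition of the L2 norm (the square root of the sum of squared components); a minimal sketch in plain Python:

Python
    import math

    vector1 = [3, 7]
    vector2 = [5, 2]

    # L2 norm: sqrt(3**2 + 7**2) and sqrt(5**2 + 2**2)
    norm1 = math.sqrt(sum(x * x for x in vector1))
    norm2 = math.sqrt(sum(x * x for x in vector2))
    print(norm1, norm2)  # 7.615773105863909 5.385164807134504

    # L2 distance: the norm of the element-wise difference
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vector1, vector2)))
    print(dist)  # 5.385164807134504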
NoSQL databases provide a flexible and scalable option for storing and retrieving data. However, they can struggle with object-oriented programming paradigms such as inheritance, which is a fundamental concept in languages like Java. This article explores the impedance mismatch that arises when dealing with inheritance in NoSQL databases.

The Inheritance Challenge in NoSQL Databases

The term "impedance mismatch" refers to the disconnect between the object-oriented world of programming languages like Java and the tabular, document-oriented, or graph-based structures of NoSQL databases. One area where this mismatch is particularly evident is in handling inheritance.

In Java, inheritance allows you to create a hierarchy of classes, where a subclass inherits properties and behaviors from its parent class. This concept is deeply ingrained in Java programming and is often used to model real-world relationships. However, NoSQL databases have no joins, so the inheritance structure needs to be handled differently.

Jakarta Persistence (JPA) and Inheritance Strategies

Before diving into more advanced solutions, it's worth mentioning that in the world of Jakarta Persistence (formerly known as JPA) there are strategies to simulate inheritance in relational databases:

- JOINED inheritance strategy: In this approach, fields specific to a subclass are mapped to a separate table from the fields common to the parent class. A join operation is performed to instantiate the subclass when needed.
- SINGLE_TABLE inheritance strategy: This strategy uses a single table representing the entire class hierarchy. Discriminator columns are used to differentiate between different subclasses.
- TABLE_PER_CLASS inheritance strategy: Each concrete entity class in the hierarchy corresponds to its own table in the database.

These strategies work well in relational databases but are not directly applicable to NoSQL databases, primarily because NoSQL databases do not support traditional joins.

Live Code Session: Java SE, Eclipse JNoSQL, and MongoDB

In this live code session, we will create a Java SE project using MongoDB as our NoSQL database. We'll focus on managing game characters, specifically Mario and Sonic characters, using Eclipse JNoSQL. You can run MongoDB locally using Docker or in the cloud with MongoDB Atlas. We'll start with the database setup and then proceed to the Java code implementation.

Setting Up MongoDB Locally

To run MongoDB locally, you can use Docker with the following command:

Shell
    docker run -d --name mongodb-instance -p 27017:27017 mongo

Alternatively, you can choose to run it in the cloud by following the instructions provided by MongoDB Atlas. With the MongoDB database up and running, let's create our Java project.

Creating the Java Project

We'll create a Java SE project using Maven and the maven-archetype-quickstart archetype.
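A sketch of generating that project skeleton from the command line follows; the groupId and artifactId are placeholders, and an IDE project wizard works just as well:

Shell
    mvn archetype:generate \
        -DarchetypeGroupId=org.apache.maven.archetypes \
        -DarchetypeArtifactId=maven-archetype-quickstart \
        -DgroupId=org.example.games \
        -DartifactId=jnosql-mongodb-demo \
        -DinteractiveMode=false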
This project will utilize the following technologies and dependencies:

- Jakarta CDI
- Jakarta JSON-P
- Eclipse MicroProfile
- Eclipse JNoSQL database

Maven Dependencies

Add the following dependencies to your project's pom.xml file:

XML
    <dependencies>
        <dependency>
            <groupId>org.jboss.weld.se</groupId>
            <artifactId>weld-se-shaded</artifactId>
            <version>${weld.se.core.version}</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.eclipse</groupId>
            <artifactId>yasson</artifactId>
            <version>3.0.3</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>io.smallrye.config</groupId>
            <artifactId>smallrye-config-core</artifactId>
            <version>3.2.1</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.eclipse.microprofile.config</groupId>
            <artifactId>microprofile-config-api</artifactId>
            <version>3.0.2</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.eclipse.jnosql.databases</groupId>
            <artifactId>jnosql-mongodb</artifactId>
            <version>${jnosql.version}</version>
        </dependency>
        <dependency>
            <groupId>net.datafaker</groupId>
            <artifactId>datafaker</artifactId>
            <version>2.0.2</version>
        </dependency>
    </dependencies>

Make sure to replace ${jnosql.version} with the appropriate version of Eclipse JNoSQL you intend to use. In the next section, we will proceed with implementing our Java code.

Implementing Our Java Code

Our GameCharacter class will serve as the parent class for all game characters and will hold the common attributes shared among them. We'll use inheritance and a discriminator column to distinguish between Sonic's and Mario's characters. Here's the initial definition of the GameCharacter class:

Java
    @Entity
    @DiscriminatorColumn("type")
    @Inheritance
    public abstract class GameCharacter {

        @Id
        @Convert(UUIDConverter.class)
        protected UUID id;

        @Column
        protected String character;

        @Column
        protected String game;

        public abstract GameType getType();
    }

In this code:

- We annotate the class with @Entity to indicate that it is a persistent entity in our MongoDB database.
- We use @DiscriminatorColumn("type") to specify that a discriminator column named "type" will be used to differentiate between subclasses.
- @Inheritance indicates that this class is part of an inheritance hierarchy.
- The GameCharacter class has a unique identifier (id), attributes for the character name (character) and game name (game), and an abstract method getType(), which its subclasses will implement to specify the character type.

Specialization Classes: Sonic and Mario

Now, let's create the specialization classes for the Sonic and Mario entities. These classes will extend the GameCharacter class and provide additional attributes specific to each character type. We'll use @DiscriminatorValue to define the values the "type" discriminator column can take for each subclass.

Java
    @Entity
    @DiscriminatorValue("SONIC")
    public class Sonic extends GameCharacter {

        @Column
        private String zone;

        @Override
        public GameType getType() {
            return GameType.SONIC;
        }
    }

In the Sonic class:

- We annotate it with @Entity to indicate it's a persistent entity.
- @DiscriminatorValue("SONIC") specifies that the "type" discriminator column will have the value "SONIC" for Sonic entities.
- We add an attribute zone that is specific to Sonic characters.
- The getType() method returns GameType.SONIC, indicating that this is a Sonic character.
Java
    @Entity
    @DiscriminatorValue("MARIO")
    public class Mario extends GameCharacter {

        @Column
        private String locations;

        @Override
        public GameType getType() {
            return GameType.MARIO;
        }
    }

Similarly, in the Mario class:

- We annotate it with @Entity to indicate it's a persistent entity.
- @DiscriminatorValue("MARIO") specifies that the "type" discriminator column will have the value "MARIO" for Mario entities.
- We add an attribute locations that is specific to Mario characters.
- The getType() method returns GameType.MARIO, indicating that this is a Mario character.

With this modeling approach, you can easily distinguish between Sonic and Mario characters in your MongoDB database using the discriminator column "type."

We will create our first database integration with MongoDB using Eclipse JNoSQL. To simplify, we will generate data using the Data Faker library. Our Java application will insert Mario and Sonic characters into the database and perform basic operations.

Application Code

Here's the main application code that generates and inserts data into the MongoDB database:

Java
    public class App {

        public static void main(String[] args) {
            try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
                DocumentTemplate template = container.select(DocumentTemplate.class).get();

                DataFaker faker = new DataFaker();
                Mario mario = Mario.of(faker.generateMarioData());
                Sonic sonic = Sonic.of(faker.generateSonicData());

                // Insert Mario and Sonic characters into the database
                template.insert(List.of(mario, sonic));

                // Count the total number of GameCharacter documents
                long count = template.count(GameCharacter.class);
                System.out.println("Total of GameCharacter: " + count);

                // Find all Mario characters in the database
                List<Mario> marioCharacters = template.select(Mario.class).getResultList();
                System.out.println("Find all Mario characters: " + marioCharacters);

                // Find all Sonic characters in the database
                List<Sonic> sonicCharacters = template.select(Sonic.class).getResultList();
                System.out.println("Find all Sonic characters: " + sonicCharacters);
            }
        }
    }

In this code:

- We use the SeContainer to manage our CDI container and initialize the DocumentTemplate from Eclipse JNoSQL.
- We create instances of Mario and Sonic characters using data generated by the DataFaker class.
- We insert these characters into the MongoDB database using the template.insert() method.
- We count the total number of GameCharacter documents in the database.
- We retrieve and display all Mario and Sonic characters from the database.

Resulting Database Structure

As a result of running this code, you will see data in your MongoDB database similar to the following structure:

JSON
    [
        {
            "_id": "39b8901c-669c-49db-ac42-c1cabdcbb6ed",
            "character": "Bowser",
            "game": "Super Mario Bros.",
            "locations": "Mount Volbono",
            "type": "MARIO"
        },
        {
            "_id": "f60e1ada-bfd9-4da7-8228-6a7f870e3dc8",
            "character": "Perfect Chaos",
            "game": "Sonic Rivals 2",
            "type": "SONIC",
            "zone": "Emerald Hill Zone"
        }
    ]

As shown in this structure, each document contains a unique identifier (_id), a character name (character), a game name (game), and a discriminator column type to differentiate between Mario and Sonic characters. You will see more characters in your MongoDB database depending on your generated data. This integration demonstrates how to insert, count, and retrieve game characters using Eclipse JNoSQL and MongoDB. You can extend and enhance this application to manage and manipulate your game character data as needed.
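The snippets above reference a GameType enum that the article does not show; a minimal sketch of what it is assumed to look like:

Java
    // Assumed shape of the enum returned by getType(): one constant per character family
    public enum GameType {
        SONIC,
        MARIO
    }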
We will create repositories for managing game characters using Eclipse JNoSQL. We will have a Console repository for general game characters and a SonicRepository specifically for Sonic characters. These repositories will allow us to interact with the database and perform various operations easily. Let’s define the repositories for our game characters. Console Repository Java @Repository public interface Console extends PageableRepository<GameCharacter, UUID> { } The Console repository extends PageableRepository and is used for general game characters. It provides common CRUD operations and pagination support. Sonic Repository Java @Repository public interface SonicRepository extends PageableRepository<Sonic, UUID> { } The SonicRepository extends PageableRepository but is specifically designed for Sonic characters. It inherits common CRUD operations and pagination from the parent repository. Main Application Code Now, let’s modify our main application code to use these repositories. For Console Repository Java public static void main(String[] args) { Faker faker = new Faker(); try (SeContainer container = SeContainerInitializer.newInstance().initialize()) { Console repository = container.select(Console.class).get(); for (int index = 0; index < 5; index++) { Mario mario = Mario.of(faker); Sonic sonic = Sonic.of(faker); repository.saveAll(List.of(mario, sonic)); } long count = repository.count(); System.out.println("Total of GameCharacter: " + count); System.out.println("Find all game characters: " + repository.findAll().toList()); } System.exit(0); } In this code, we use the Console repository to save both Mario and Sonic characters, demonstrating its ability to manage general game characters. For Sonic Repository Java public static void main(String[] args) { Faker faker = new Faker(); try (SeContainer container = SeContainerInitializer.newInstance().initialize()) { SonicRepository repository = container.select(SonicRepository.class).get(); for (int index = 0; index < 5; index++) { Sonic sonic = Sonic.of(faker); repository.save(sonic); } long count = repository.count(); System.out.println("Total of Sonic characters: " + count); System.out.println("Find all Sonic characters: " + repository.findAll().toList()); } System.exit(0); } This code uses the SonicRepository to save Sonic characters specifically. It showcases how to work with a repository dedicated to a particular character type. With these repositories, you can easily manage, query, and filter game characters based on their type, simplifying the code and making it more organized. Conclusion In this article, we explored the seamless integration of MongoDB with Java using the Eclipse JNoSQL framework for efficient game character management. We delved into the intricacies of modeling game characters, addressing challenges related to inheritance in NoSQL databases while maintaining compatibility with Java's object-oriented principles. By employing discriminator columns, we could categorize characters and store them within the MongoDB database, creating a well-structured and extensible solution. Through our Java application, we demonstrated how to generate sample game character data using the Data Faker library and efficiently insert it into MongoDB. We performed essential operations, such as counting the number of game characters and retrieving specific character types. 
Moreover, we introduced the concept of repositories in Eclipse JNoSQL, showcasing their value in simplifying data management and enabling focused queries based on character types. This article provides a solid foundation for harnessing the power of Eclipse JNoSQL and MongoDB to streamline NoSQL database interactions in Java applications, making it easier to manage and manipulate diverse data sets. Source code
Concurrency is one of the most complex problems we (developers) can face in our daily work. Additionally, it is also one of the most common problems that we may face while solving our day-to-day issues. The combination of both these factors is what truly makes concurrency and multithreading the most dangerous issues software engineers may encounter. What is more, solving concurrency problems with low-level abstractions can be quite a cognitive challenge and lead to complex, nondeterministic errors. That is why most languages introduce higher-level abstractions that allow us to solve concurrency-related problems with relative ease and not spend time tweaking low-level switches. In this article, I would like to dive deeper into such abstractions provided by the Java standard library, namely the ExecutorService interface and its implementations. I also want this article to be an entry point for my next article about benchmarking Java Streams. Before that, let's have a quick recap on processes and threads: What Is a Process? It is the simplest unit that can be executed on its own inside our computers. Thanks to processes, we can split the work being done inside our machines into smaller, more modular, and manageable parts. Such an approach allows us to make particular parts more focused and thus more performant. A split like that also makes it possible to take full advantage of the multiple cores built into our CPUs. In general, each process is an instance of a particular program — for example, our up-and-running Java process is a JVM program instance. Moreover, each process is rooted inside the OS and has its own unique set of resources, accesses, and behavior (the program code) — similar to our application users. Each process can have several threads (at least in most operating systems) that work together toward the completion of common tasks assigned by the process. What Is a Thread? It can be viewed as a branch of our code with a specific set of instructions that is executed in parallel to the work done by the rest of the application. Threads enable concurrent execution of multiple sequences of instructions within a single process. On the software level, we can differentiate two types of threads: Kernel (system) threads: Threads managed by the OS directly. The operating system kernel performs thread creation, scheduling, and management. Application (user) threads: Threads managed at the user level by a thread library or runtime environment, independent of the operating system. Because they are not visible to the operating system kernel, the kernel manages the owning process as if it were single-threaded. Here, I will focus mostly on application threads. I will also mention CPU-related threads. These are the hardware threads, a trait of our CPUs. Their number describes the capability of our CPU to handle multiple threads simultaneously. In principle, threads can share resources in a much lighter fashion than processes. All threads within a process have access to all the data owned by their parent process. Additionally, each thread can have its own data, more commonly known as thread-local variables (or, in the case of Java, the newer and more recommended scoped values). What is more, switching between threads is much easier than switching between processes. What Is a Thread Pool? A thread pool is a more concrete term than both thread and process. It is related to application threads and describes a set of such threads that we can use inside our application. It works based on a very simple behavior.
We just take threads one by one from the pool until the pool is empty. That’s it. However, there is an additional assumption to this rule, namely that the threads will be returned to the pool once their task is completed. Of course, applications may have more than one thread pool, and in fact, the more specialized our thread pool is, the better for us. With such an approach, we can limit contention within the application and remove single points of failure. The industry standard nowadays is to have at least a separate thread pool for database connections. Threads, ThreadPools, and Java In older versions of Java — before Java 21 — all threads used inside the application were bound to operating system threads. Thus, they were quite expensive and heavy. If, by accident (or intent), you spawn too many threads in your Java application — for example, by calling “new Thread()” directly — then you can very quickly run out of resources, and the performance of your application will decrease rapidly, as, among other things, the CPU needs to do a lot of context switching. Project Loom, part of the Java 21 release, aimed to address this issue by adding virtual threads — threads that are not bound to operating system threads, so-called green threads — to the Java standard library. If you would like to know more about Loom and the changes it brings to Java threads, I recommend this article. In Java, the concept of the thread pool is implemented by ThreadPoolExecutor — a class that represents a thread pool of finite size with the upper bound described by the maximumPoolSize parameter of the class constructor. As a side note, I would like to add that this executor is used further in more complex executors as an internal thread pool. Executor, ExecutorService, and Executors Before we move on to describe the more complex implementations of executor interfaces that utilize a ThreadPoolExecutor, there is one more question I would like to answer: namely, what are the Executor and ExecutorService themselves? Executor The Executor is an interface exposing only one method, execute, with the following signature: void execute(Runnable command). The interface is designed to describe a very simple capability: a class implementing it can execute a provided Runnable. The interface's purpose is to provide a way of decoupling task submission from the mechanics of how the task will be run. ExecutorService The ExecutorService is yet another interface, an extension of the Executor interface with a much more powerful contract (around 13 methods to override if we decide to implement it). Its main aim is to help with the management and running of asynchronous tasks by wrapping such tasks in Java Futures. Additionally, the ExecutorService extends the AutoCloseable interface. This allows us to use ExecutorService in try-with-resources syntax and close the resource in an ordered fashion. Executors The Executors class, on the other hand, is a utility class. It is the recommended way of spawning new instances of the executors — using the new keyword is not recommended for most of the executors. What is more, it provides methods for creating variations of Callable instances, for example, a Callable with a fixed return value. With these three basic concepts described, we can move on to the different executor service implementations. The Executor Services As of now, the Java standard library supports four main implementations of the ExecutorService interface.
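Before looking at each implementation, here is a minimal sketch of the workflow they all share: spawning the executor through the Executors class, submitting work, and reading the result back through a Future. The pool type and size below are arbitrary illustration choices, and the try-with-resources form assumes a JDK (19 or newer) in which ExecutorService is AutoCloseable.

Java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorServiceDemo {

    public static void main(String[] args) throws Exception {
        // Spawn the executor via the Executors utility class, as recommended.
        try (ExecutorService pool = Executors.newFixedThreadPool(2)) {
            // submit() wraps the task in a Future, giving us a handle on the result.
            Future<Integer> answer = pool.submit(() -> 40 + 2);

            // execute() fires and forgets a Runnable; no Future comes back.
            pool.execute(() -> System.out.println("side effect"));

            System.out.println("Computed: " + answer.get()); // blocks until the task is done
        } // close() waits for submitted tasks and then shuts the pool down
    }
}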
Each one provides a set of more or less unique features. They go as follows: ThreadPoolExecutor ForkJoinPool ScheduledThreadPoolExecutor ThreadPerTaskExecutor Additionally, there are three private static implementations in the Executors class which implement ExecutorService: DelegatedExecutorService DelegatedScheduledExecutorService AutoShutdownDelegatedExecutorService Overall, the dependency graph between the classes looks more or less like this: ThreadPoolExecutor As I said before, it is an implementation of the thread pool concept in Java. This executor represents a bounded thread pool with a dynamic number of threads. What this exactly means is that ThreadPoolExecutor will use a finite number of threads, but the number of used threads will never be higher than specified on pool creation. To achieve that, ThreadPoolExecutor uses two variables: corePoolSize and maximumPoolSize. The first one — corePoolSize — describes the minimal number of threads in the pool, so even if the threads are idle, the pool will keep them alive. On the other hand, the second one — maximumPoolSize — describes, as you probably guessed by now, the maximum number of threads owned by the pool. This is our upper bound of threads inside the pool. The pool will never have more threads than the value of this parameter. Additionally, ThreadPoolExecutor uses a BlockingQueue underneath to keep track of incoming tasks. ThreadPoolExecutor Behavior By default, if the current number of up-and-running threads is smaller than corePoolSize, calling the execute method will result in spawning a new thread with the incoming task as the thread’s first piece of work — even if there are idle threads in the pool at the moment. If, for some reason, the pool is unable to add a new thread, the pool will move to behavior two. If the number of running threads is higher than or equal to corePoolSize, or the pool was unable to spawn a new thread, calling the execute method will result in an attempt to add the new task to the queue: isRunning(c) && workQueue.offer(command). If, after the recheck, the pool is still running but has no worker threads, we add a new thread to the pool without any task — the only case when we spawn a new thread without any task: addWorker(null, false);. On the contrary, if the pool is not running, we remove the new command from the queue (!isRunning(recheck) && remove(command)) and the pool rejects the command with a RejectedExecutionException: reject(command);. If for some reason we cannot add the task to the queue, the pool tries to start a new thread with the task as its first task: else if (!addWorker(command, false)). If it fails to do so, the task is rejected, and a RejectedExecutionException is thrown with a message similar to the one below. Task X rejected from java.util.concurrent.ThreadPoolExecutor@3a71f4dd[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0] Above, you can see a very, very simplified visualization of ThreadPoolExecutor’s internal state. From here on out, you can expect two outcomes: either Task4 (or Task5) will be processed before submitting Task6 to the pool, or Task6 will be submitted to the pool before the end of Task4 (or Task5) processing. The first scenario is quite boring, as everything stays the same from the perspective of the executor, so I will only spend a little time on this. The second scenario is much more interesting as it will result in a change of executor state.
Because the current number of running threads is smaller than corePoolSize, submitting Task6 to the executor will result in spawning a new worker for this task. The final state will look more or less like the one below. ThreadPoolExecutor Pool Size Fixed-size pool: By setting corePoolSize and maximumPoolSize to the same value, you can essentially create a fixed-size thread pool; the number of threads run by the pool will never go below or above the set value — at least not for long. Unbounded pool: By setting maximumPoolSize to a high enough value, such as Integer.MAX_VALUE, you can make it practically unbounded. There is a practical limit of around 500 million ((2²⁹)-1) threads coming from the ThreadPoolExecutor implementation. However, I bet that your machine will be down before reaching the max. If you would like to know more about the reasoning behind such a number being the limit, there is a very nice JavaDoc describing this. It is located just after the declaration of the ThreadPoolExecutor class. I will just drop a hint that it is related to the way ThreadPoolExecutor holds its state. Spawning ThreadPoolExecutor The Executors class gives you a total of six methods to spawn the ThreadPoolExecutor. I will describe them in pairs, as that is how they are designed to work. public static ExecutorService newFixedThreadPool(int nThreads) public static ExecutorService newFixedThreadPool(int nThreads, ThreadFactory threadFactory) These methods create a fixed-size thread pool — core and max sizes are equal. Additionally, you can pass a threadFactory as an argument if you prefer not to use the default one from the standard library. Executors.newFixedThreadPool(2); Executors.newFixedThreadPool(2, Executors.defaultThreadFactory()); The next pair of methods: public static ExecutorService newCachedThreadPool() public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) The methods above create de facto unbounded thread pools by setting maximumPoolSize to Integer.MAX_VALUE; the second version, similar to the fixed thread pool variant, allows passing a customized ThreadFactory. Executors.newCachedThreadPool(); Executors.newCachedThreadPool(Executors.defaultThreadFactory()); And the last two methods: public static ExecutorService newSingleThreadExecutor() public static ExecutorService newSingleThreadExecutor(ThreadFactory threadFactory) Both methods create a thread pool that uses a single thread. What is more, these methods spawn a ThreadPoolExecutor that is wrapped in an AutoShutdownDelegatedExecutorService. It only exposes the methods of ExecutorService: no ThreadPoolExecutor-specific methods are available. Moreover, the shutdown method is overridden with the help of a Cleanable and is called when the executor becomes phantom reachable. ThreadPoolExecutor Adding New Tasks The default way of adding new tasks to be executed by a ThreadPoolExecutor is to use one of the versions of the submit method. On the other hand, one may also use the execute method from the Executor interface directly. However, it is not the recommended way. Using the execute method will return void instead of a Future: it means that you would have less control over the task’s execution, so the choice is yours. Of course, both approaches will trigger all the logic described above with the thread pool.
Java Runnable task = () -> System.out.print("test"); Future<?> submit = executorService.submit(task); vs executorService.execute(task); ForkJoinPool ForkJoinPool is a totally separate ExecutorService implementation whose main selling point is the concept of work stealing. Work stealing is quite a complex concept worthy of a solely focused blog post. However, it can be described reasonably simply at a high enough level of abstraction — all threads within the pool try to execute any task submitted to the pool, no matter its original owner. That is why the concept is called work stealing, as the threads “steal” each other’s work. In theory, such an approach should result in noticeable performance gains, especially if submitted tasks are small or spawn other subtasks. In the near future, I am planning to publish a separate post solely focused on work stealing and the Fork/Join framework. Until then, you can read more about work stealing here. ForkJoinPool Behavior This executor is a part of the Java Fork/Join framework introduced in Java 7, a set of classes aimed at more efficient parallel processing of tasks by utilizing the concept of work stealing. Currently, the framework is extensively used in the context of CompletableFuture and Streams. If you want to get the most out of the ForkJoinPool, I would recommend getting familiar with the Fork/Join framework as a whole. Then, try to switch your approach to handling such tasks to one that is more in line with Fork/Join requirements. Before going head-on into the Fork/Join framework, please do benchmarks and performance tests, as potential gains may not be as good as expected. Additionally, there is a table in the ForkJoinPool JavaDoc describing which method one should use for interacting with ForkJoinPool to get the best results. The crucial parameter in the case of ForkJoinPool is parallelism — it describes the number of worker threads the pool will be using. By default, it is equal to the number of available processors on your CPU. In most cases, it is a sufficient setup, and I would recommend not changing it without proper performance tests. Keep in mind that classic Java threads are backed by CPU (OS) threads, and we can quickly run out of processing power to make progress on our tasks. Spawning ForkJoinPool Instances The Executors class provides two methods for spawning instances of ForkJoinPool: Executors.newWorkStealingPool(); Executors.newWorkStealingPool(2); The first one creates a ForkJoinPool with default parallelism — the number of available processors — while the second gives us the possibility to specify the parallelism level ourselves. What is more, Executors spawns ForkJoinPool with a FIFO queue underneath, while the default setting of ForkJoinPool itself (for example, via new ForkJoinPool(2)) is LIFO. Using LIFO with Executors is impossible. Despite that, you can change the type of the underlying queue by using the asyncMode constructor parameter of the ForkJoinPool class. With the FIFO setting, the ForkJoinPool may be better suited for tasks that are never joined — so, for example, for Callable or Runnable usages. ScheduledThreadPoolExecutor ScheduledThreadPoolExecutor adds a layer over the classic ThreadPoolExecutor and allows for scheduling tasks. It supports three types of scheduling: Schedule once after a fixed delay - schedule Schedule repeatedly at a fixed rate - scheduleAtFixedRate Schedule repeatedly with a fixed delay between executions - scheduleWithFixedDelay For sure, you can also use the “normal” ExecutorService API.
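A minimal sketch of the three scheduling styles mentioned above could look like the snippet below; the delays, the period, and the printed messages are arbitrary illustration values, and the scheduler is intentionally left running for brevity.

Java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchedulingDemo {

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        // Run once, 5 seconds from now.
        scheduler.schedule(() -> System.out.println("one-shot"), 5, TimeUnit.SECONDS);

        // Run every 2 seconds, measured from the start of each execution.
        scheduler.scheduleAtFixedRate(() -> System.out.println("fixed rate"), 0, 2, TimeUnit.SECONDS);

        // Run with a 2-second pause between the end of one execution and the start of the next.
        scheduler.scheduleWithFixedDelay(() -> System.out.println("fixed delay"), 0, 2, TimeUnit.SECONDS);
    }
}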
Just remember that, when you use the plain submit and execute methods, they are equivalent to calling the schedule method with a delay of 0 — instant execution of the provided task. As ScheduledThreadPoolExecutor extends ThreadPoolExecutor, some parts of its implementation are the same as in the classic ThreadPoolExecutor. Nevertheless, it uses its own task implementation, ScheduledFutureTask, and its own queue, DelayedWorkQueue. What is more, ScheduledThreadPoolExecutor effectively works as a fixed-size thread pool: because its work queue is unbounded, the pool never grows beyond corePoolSize. There is, however, a catch or two hidden inside the implementation of ScheduledThreadPoolExecutor. Firstly, if two tasks are scheduled to run (or end up scheduled to be run) at the same time, they are executed in FIFO style based on submission time. The next catch is the logical consequence of the first. There are no real guarantees that a particular task will execute at a certain point in time — as, for example, it may still be waiting in the queue, as described above. Last but not least, if for some reason the execution of two tasks should end up overlapping each other, the thread pool guarantees that the execution of the first one will “happen before” the execution of the later one. Essentially, this is FIFO fashion, even with respect to tasks submitted from different threads. Spawning ScheduledThreadPoolExecutor The Executors class gives us five ways to spawn the ScheduledThreadPoolExecutor. They are organized in a similar fashion to those of ThreadPoolExecutor. ScheduledThreadPoolExecutor with a fixed number of threads: Executors.newScheduledThreadPool(2); Executors.newScheduledThreadPool(2, Executors.defaultThreadFactory()); The first method allows us to create a ScheduledThreadPoolExecutor with a particular number of threads. The second method adds the ability to pass a ThreadFactory of choice. ScheduledThreadPoolExecutor with a single thread: Executors.newSingleThreadScheduledExecutor(); Executors.newSingleThreadScheduledExecutor(Executors.defaultThreadFactory()); These create a single-thread scheduled executor whose underlying ThreadPoolExecutor has only one thread. In fact, here we spawn an instance of DelegatedScheduledExecutorService, which uses a ScheduledThreadPoolExecutor as a delegate, so in the end, the underlying ThreadPoolExecutor of the delegate has only one thread. The last way to spawn the ScheduledThreadPoolExecutor is via: Executors.unconfigurableScheduledExecutorService(new DummyScheduledExecutorServiceImpl()); This method allows you to wrap your own implementation of the ScheduledExecutorService interface with DelegatedScheduledExecutorService — one of the private static classes from the Executors class. This one only exposes the methods of the ScheduledExecutorService interface. To a degree, we can view it as an encapsulation helper. You can have multiple public methods within your implementation, but when you wrap it with the delegate, all of them will become hidden from the users. I am not exactly a fan of such an approach to encapsulation; in my view, it should be the responsibility of the implementation itself. However, maybe I am missing some other important use cases of all of the delegates. ThreadPerTaskExecutor It is one of the newest additions to the Java standard library and the Executors class. It is not a thread pool implementation, but rather a thread spawner. As the name suggests, each submitted task gets its own thread bound to its execution; the thread starts alongside the start of task processing.
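Before digging into the internals, here is a minimal usage sketch of the virtual-thread flavor of this executor; the task count and the sleep duration are arbitrary illustration values, and the snippet assumes Java 21, where Executors.newVirtualThreadPerTaskExecutor() is available.

Java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class ThreadPerTaskDemo {

    public static void main(String[] args) {
        // Each submitted task gets its own freshly started (virtual) thread.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, 10_000).forEach(i -> executor.submit(() -> {
                Thread.sleep(Duration.ofMillis(100)); // simulate blocking work
                return i;
            }));
        } // close() waits until all submitted tasks have finished
    }
}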
Internally, to achieve this behavior, the executor uses its own custom implementation of Future, namely ThreadBoundFuture. The lifecycle of threads created by this executor looks more or less like this: The thread is created as soon as the Future is created. The thread starts working only after being programmatically started by the Executor. The thread is interrupted when the Future is interrupted. The thread is stopped on Future completion. What is more, if the Executor is not able to start a new thread for a particular task and no exception is thrown along the way, then the Executor will throw a RejectedExecutionException. Furthermore, the ThreadPerTaskExecutor holds a set of its started threads. Every time a thread is started, it is added to the set; respectively, when the thread is stopped, it is removed from the set. You can then use this set to keep track of how many threads the Executor is running at a given time via the threadCount() method. Spawning ThreadPerTaskExecutor The Executors class exposes two ways of spawning this Executor. I would say that one is more recommended than the other. Let’s start with the not-recommended one: Executors.newThreadPerTaskExecutor(Executors.defaultThreadFactory()); The above method spawns a ThreadPerTaskExecutor with the provided thread factory. The reason why it is not recommended, at least by me, is that this ThreadPerTaskExecutor instance will operate on plain old platform threads, each bound to an OS thread. In such a case, if you put a high enough number of tasks through the Executor, you can very, very easily run out of processing power for your application. Certainly, nothing stands in your way of doing the following “trick” and using virtual threads anyway. Executors.newThreadPerTaskExecutor(Thread.ofVirtual().factory()); However, there is no reason to do that when you can simply use the following: Executors.newVirtualThreadPerTaskExecutor(); This instance of ThreadPerTaskExecutor will take full advantage of virtual threads from Java 21. Such a setting should also greatly increase the number of tasks that your Executor will be able to handle before running out of processing power. Summary As you can see, Java provides a set of different executors, from the classic thread pool implementation in ThreadPoolExecutor to more complex ones like ThreadPerTaskExecutor, which takes full advantage of virtual threads, a feature from Java 21. What is more, each Executor implementation has its unique trait: ThreadPoolExecutor: Classic thread pool implementation ForkJoinPool: Work stealing ScheduledThreadPoolExecutor: Periodic scheduling of tasks ThreadPerTaskExecutor: Usage of virtual threads and the possibility to run each task in its own separate short-lived thread Despite these differences, all of the executors have one defining trait: all of them expose an API that makes concurrent processing of multiple tasks much easier. I hope that this knowledge will become useful for you sometime in the future. Thank you for your time. Note: Thank you to Michał Grabowski and Krzysztof Atlasik for the review.
As part of learning the Rust ecosystem, I dedicated the last few days to error management. Here are my findings. Error Management 101 The Rust book describes the basics of error management. The language distinguishes between recoverable errors and unrecoverable ones. Unrecoverable errors rely on the panic!() macro. When Rust panics, it stops the program. Recoverable errors are much more enjoyable. Rust uses the Either monad, which stems from functional programming. In contrast to exceptions in other languages, FP mandates returning a structure that may contain either the requested value or the error. The language models it as an enum with a generic type on each variant: Rust #[derive(Copy, PartialEq, PartialOrd, Eq, Ord, Debug, Hash)] pub enum Result<T, E> { Ok(T), Err(E), } Because Rust enforces the exhaustiveness of matches, matching on a Result requires that you handle both branches: Rust match fn_that_returns_a_result() { Ok(value) => do_something_with_value(value), Err(error) => handle_error(error), } If you omit one of the two branches, compilation fails. The above code is safe if unwieldy. But Rust offers a full-fledged API around the Result enum. The API implements the monad paradigm. Propagating Results Propagating results and errors is one of the main micro-tasks in programming. Here's a naive way to approach it: Rust #[derive(Debug)] struct Foo {} #[derive(Debug)] struct Bar { foo: Foo } #[derive(Debug)] struct MyErr {} fn main() { print!("{:?}", a(false)); } fn a(error: bool) -> Result<Bar, MyErr> { match b(error) { //1 Ok(foo) => Ok(Bar{ foo }), //2 Err(my_err) => Err(my_err) //3 } } fn b(error: bool) -> Result<Foo, MyErr> { if error { Err(MyErr {}) } else { Ok(Foo {}) } } Return a Result which contains a Bar or a MyErr If the call is successful, unwrap the Foo value, wrap it again, and return it If it isn't, unwrap the error, wrap it again, and return it The above code is a bit verbose, and because this construct is quite widespread, Rust offers the ? operator: When applied to values of the Result type, it propagates errors. If the value is Err(e), then it will return Err(From::from(e)) from the enclosing function or closure. If applied to Ok(x), then it will unwrap the value to evaluate to x. —The question mark operator We can apply it to the above a function: Rust fn a(error: bool) -> Result<Bar, MyErr> { let foo = b(error)?; Ok(Bar{ foo }) } The Error Trait Note that Result enforces no bound on the second type parameter, the "error" type. However, Rust provides an Error trait. Two widespread libraries help us manage our errors more easily. Let's detail them in turn. Implement the Error Trait With thiserror In the above section, I described how a struct could implement the Error trait. However, doing so requires quite a load of boilerplate code. The thiserror crate provides macros to write the code for us. Here's the documentation sample: Rust #[derive(Error, Debug)] //1 pub enum DataStoreError { #[error("data store disconnected")] //2 Disconnect(#[from] io::Error), #[error("the data for key `{0}` is not available")] //3 Redaction(String), #[error("invalid header (expected {expected:?}, found {found:?})")] //4 InvalidHeader { expected: String, found: String, } } Base Error macro Static error message Dynamic error message, using field index Dynamic error message, using field name thiserror helps you generate your errors. Propagate Result With anyhow The anyhow crate offers several features: A custom anyhow::Result type.
I will focus on this one A way to attach context to a function returning an anyhow::Result Additional backtrace environment variables Compatibility with thiserror A macro to create errors on the fly Result propagation has one major issue: function signatures across unrelated error types. The above snippet used a single enum, but in real-world projects, errors may come from different crates. Here's an illustration: Rust #[derive(thiserror::Error, Debug)] #[error("error X")] pub struct ErrorX {} //1 #[derive(thiserror::Error, Debug)] #[error("error Y")] pub struct ErrorY {} //1 fn a(flag: i8) -> Result<Foo, Box<dyn std::error::Error>> { //2 match flag { 1 => Err(ErrorX{}.into()), //3 2 => Err(ErrorY{}.into()), //3 _ => Ok(Foo{}) } } Two error types, each implemented with a different struct with thiserror (the #[error("...")] attribute supplies the Display message the derive requires) Rust needs to know the size of the return type at compile time. Because the function can return either one or the other type, we must return a fixed-sized pointer; that's the point of the Box construct. For a discussion on when to use Box compared to other constructs, please read this StackOverflow question. To wrap the struct into a Box, we rely on the into() method With anyhow, we can simplify the above code: Rust fn a(flag: i8) -> anyhow::Result<Foo> { match flag { 1 => Err(ErrorX{}.into()), 2 => Err(ErrorY{}.into()), _ => Ok(Foo{}) } } With the Context trait, we can improve the user experience with additional details. The with_context() method is evaluated lazily, while context() is evaluated eagerly. Here's how you can use the latter: Rust fn a(flag: i8) -> anyhow::Result<Bar> { let foo = b(flag).context(format!("Oopsie! {}", flag))?; //1 Ok(Bar{ foo }) } fn b(flag: i8) -> anyhow::Result<Foo> { match flag { 1 => Err(ErrorX{}.into()), 2 => Err(ErrorY{}.into()), _ => Ok(Foo{}) } } If the function fails, print the additional Oopsie! error message with the flag value Conclusion Rust implements error handling via the Either monad of FP and the Result enum. Managing such code in bare Rust requires boilerplate code. The thiserror crate can easily implement the Error trait for your structs, while anyhow simplifies function and method signatures. To Go Further Rust error handling The Error trait anyhow crate thiserror crate What is the difference between context and with_context in anyhow? Error handling across different languages
It’s been more than 20 years since the Spring Framework appeared in the software development landscape and 10 since Spring Boot version 1.0 was released. By now, nobody should have any doubt that Spring has created a unique style through which developers are freed from repetitive tasks and left to focus on business value delivery. As years passed, Spring’s technical depth has continually increased, covering a wide variety of development areas and technologies. On the other hand, its technical breadth has continually expanded as more focused solutions have been experimented with, proofs of concept created, and ultimately promoted under the projects’ umbrella (toward the technical depth). One such example is the new Spring AI project which, according to its reference documentation, aims to ease development when a generative artificial intelligence layer is to be incorporated into applications. Once again, developers are freed from repetitive tasks and offered simple interfaces for direct interaction with the pre-trained models that incorporate the actual processing algorithms. By interacting with generative pre-trained transformers (GPTs) directly or programmatically via Spring AI, users (developers) do not need to possess extensive machine learning knowledge (although it would be useful). As an engineer, I strongly believe that even if such (developer) tools can be rather easily and rapidly used to produce results, it is advisable to temper ourselves, switch to a watchful mode, and try to gain a decent understanding of the base concepts first. Moreover, by following this path, the outcome might be even more useful. Purpose This article shows how Spring AI can be integrated into a Spring Boot application to fulfill a programmatic interaction with Open AI. It is assumed that prompt design in general (prompt engineering) is a craft in its own right. Consequently, the prompts used during experimentation are quite didactic, without much applicability. The focus here is on the communication interface, that is, the Spring AI API. Before the Implementation First and foremost, one shall clarify the rationale for incorporating and utilizing a GPT solution, in addition to the desire to deliver with greater quality, in less time, and with lower costs. Generative AI is said to be good at doing a great deal of time-consuming tasks more quickly and efficiently and outputting the results. Moreover, if these results are further validated by experienced and wise humans, the chances of obtaining something useful increase. Fortunately, people are still part of the scenery. Next, one shall resist the temptation to jump right into the implementation and at least dedicate some time to get a bit familiar with the general concepts. An in-depth exploration of generative AI concepts is way beyond the scope of this article. Nevertheless, the “main actors” that appear in the interaction are briefly outlined below.
The Stage – Generative AI is part of machine learning, which is part of artificial intelligence Input – The provided data (incoming) Output – The computed results (outgoing) Large Language Model (LLM) – The fine-tuned algorithm that, based on the interpreted input, produces the output Prompt – The interface through which the input is passed to the model Prompt Template – A component that allows constructing structured, parameterized prompts Tokens – The components the algorithm internally translates the input into, then uses to compile the results, and from which it ultimately constructs the output Model’s context window – The threshold by which the model limits the number of tokens per call (usually, the more tokens are used, the more expensive the operation is) Finally, an implementation may be started, but as it progresses, it is advisable to revisit and refine the first two steps. Prompts In this exercise, we ask for the following: Plain Text Write {count = three} reasons why people in {location = Romania} should consider a {job = software architect} job. These reasons need to be short, so they fit on a poster. For instance, "{job} jobs are rewarding." This basically represents the prompt. As advised, a clear topic, a clear meaning of the task, and additional helpful pieces of information should be provided as part of the prompts in order to increase the accuracy of the results. The prompt contains three parameters, which allow coverage for a wide range of jobs in various locations. count – The number of reasons aimed for as part of the output job – The domain or job of interest location – The country, town, region, etc. where the job applicants reside Proof of Concept In this post, the simple proof of concept aims at the following: Integrate Spring AI in a Spring Boot application and use it. Allow a client to communicate with Open AI via the application. The client issues a parametrized HTTP request to the application. The application uses a prompt to create the input, sends it to Open AI, and retrieves the output. The application sends the response to the client. Setup Java 21 Maven 3.9.2 Spring Boot – v. 3.2.2 Spring AI – v. 0.8.0-SNAPSHOT (still in development, experimental) Implementation Spring AI Integration Normally, this is a basic step not necessarily worth mentioning. Nevertheless, since Spring AI is currently released as a snapshot, in order to be able to integrate the Open AI auto-configuration dependency, one shall add a reference to the Spring Milestone/Snapshot repositories. XML <repositories> <repository> <id>spring-milestones</id> <name>Spring Milestones</name> <url>https://repo.spring.io/milestone</url> <snapshots> <enabled>false</enabled> </snapshots> </repository> <repository> <id>spring-snapshots</id> <name>Spring Snapshots</name> <url>https://repo.spring.io/snapshot</url> <releases> <enabled>false</enabled> </releases> </repository> </repositories> The next step is to add the spring-ai-openai-spring-boot-starter Maven dependency. XML <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-openai-spring-boot-starter</artifactId> <version>0.8.0-SNAPSHOT</version> </dependency> Open AI ChatClient is now part of the application classpath. It is the component used to send the input to Open AI and retrieve the output. In order to be able to connect to the AI Model, the spring.ai.openai.api-key property needs to be set up in the application.properties file.
Properties files spring.ai.openai.api-key = api-key-value Its value represents a valid API key of the user on behalf of whom the communication is made. By accessing the Open AI Platform, one can either sign up or sign in and generate one. Client: Spring Boot Application Communication The first part of the proof of concept is the communication between a client application (e.g., browser, cURL, etc.) and the application developed. This is done via a REST controller, accessible through an HTTP GET request. The URL is /job-reasons, together with the three parameters previously outlined when the prompt was defined, which leads to the following form: Plain Text /job-reasons?count={count}&job={job}&location={location} And the corresponding controller: Java @RestController public class OpenAiController { @GetMapping("/job-reasons") public ResponseEntity<String> jobReasons(@RequestParam(value = "count", required = false, defaultValue = "3") int count, @RequestParam("job") String job, @RequestParam("location") String location) { return ResponseEntity.ok().build(); } } Since the response from Open AI is going to be a String, the controller returns a ResponseEntity that encapsulates a String. If we run the application and issue a request, currently nothing is returned as part of the response body. Client: Open AI Communication Spring AI currently focuses on AI Models that process language and produce language or numbers. Examples of Open AI models in the former category are GPT4-openai or GPT3.5-openai. For fulfilling an interaction with these AI Models, which actually designate Open AI algorithms, Spring AI provides a uniform interface. The ChatClient interface currently supports text input and output and has a simple contract. Java @FunctionalInterface public interface ChatClient extends ModelClient<Prompt, ChatResponse> { default String call(String message) { Prompt prompt = new Prompt(new UserMessage(message)); return call(prompt).getResult().getOutput().getContent(); } ChatResponse call(Prompt prompt); } The abstract call(Prompt) method of the functional interface is the one usually used. In the case of our proof of concept, this is exactly what is needed: a way of calling Open AI and sending the parametrized Prompt as a parameter. The following OpenAiService is defined, where an instance of ChatClient is injected. Java @Service public class OpenAiService { private final ChatClient client; public OpenAiService(OpenAiChatClient aiClient) { this.client = aiClient; } public String jobReasons(int count, String domain, String location) { final String promptText = """ Write {count} reasons why people in {location} should consider a {job} job. These reasons need to be short, so they fit on a poster. For instance, "{job} jobs are rewarding." """; final PromptTemplate promptTemplate = new PromptTemplate(promptText); promptTemplate.add("count", count); promptTemplate.add("job", domain); promptTemplate.add("location", location); ChatResponse response = client.call(promptTemplate.create()); return response.getResult().getOutput().getContent(); } } With the application running, if the following request is performed from the browser: Plain Text http://localhost:8080/gen-ai/job-reasons?count=3&job=software%20architect&location=Romania Then the below result is retrieved: Lucrative career: Software architect jobs offer competitive salaries and excellent growth opportunities, ensuring financial stability and success in Romania.
In-demand profession: As the demand for technology continues to grow, software architects are highly sought after in Romania and worldwide, providing abundant job prospects and job security. Creative problem-solving: Software architects play a crucial role in designing and developing innovative software solutions, allowing them to unleash their creativity and make a significant impact on various industries. This is exactly what was intended – an easy interface through which the Open AI GPT model can be asked to write a couple of reasons why a certain job in a certain location is appealing. Adjustments and Observations The simple proof of concept developed so far mainly uses the default configurations available. The ChatClient instance may be configured according to the desired needs via various properties. As this is beyond the scope of this writing, only two are exemplified here. spring.ai.openai.chat.options.model designates the AI Model to use. By default, it is "gpt-3.5-turbo," but "gpt-4" and "gpt-4-32k" designate the latest versions. Although available, one may not be able to access these using a pay-as-you-go plan, but there are additional pieces of information available on the Open AI website to accommodate it. Another property worth mentioning is spring.ai.openai.chat.options.temperature. According to the reference documentation, the sampling temperature controls the “creativity of the responses." It is said that higher values make the output “more random," while lower ones are “more focused and deterministic." The default value is 0.8; if we decrease it to 0.3, restart the application, and ask again with the same request parameters, the below result is retrieved. Lucrative career opportunities: Software architect jobs in Romania offer competitive salaries and excellent growth prospects, making it an attractive career choice for individuals seeking financial stability and professional advancement. Challenging and intellectually stimulating work: As a software architect, you will be responsible for designing and implementing complex software systems, solving intricate technical problems, and collaborating with talented teams. This role offers continuous learning opportunities and the chance to work on cutting-edge technologies. High demand and job security: With the increasing reliance on technology and digital transformation across industries, the demand for skilled software architects is on the rise. Choosing a software architect job in Romania ensures job security and a wide range of employment options, both locally and internationally. It is visible that the output is way more descriptive in this case. One last consideration is related to the structure of the output obtained. It would be convenient to have the ability to map the actual payload received to a Java object (class or record, for instance). As of now, the representation is textual and so is the implementation. Output parsers may achieve this, similarly to Spring JDBC’s mapping structures. In this proof of concept, a BeanOutputParser is used, which allows deserializing the result directly into a Java record, as below: Java public record JobReasons(String job, String location, List<String> reasons) { } This is done by taking the {format} as part of the prompt text and providing it as an instruction to the AI Model.
The OpenAiService method becomes: Java public JobReasons formattedJobReasons(int count, String job, String location) { final String promptText = """ Write {count} reasons why people in {location} should consider a {job} job. These reasons need to be short, so they fit on a poster. For instance, "{job} jobs are rewarding." {format} """; BeanOutputParser<JobReasons> outputParser = new BeanOutputParser<>(JobReasons.class); final PromptTemplate promptTemplate = new PromptTemplate(promptText); promptTemplate.add("count", count); promptTemplate.add("job", job); promptTemplate.add("location", location); promptTemplate.add("format", outputParser.getFormat()); promptTemplate.setOutputParser(outputParser); final Prompt prompt = promptTemplate.create(); ChatResponse response = client.call(prompt); return outputParser.parse(response.getResult().getOutput().getContent()); } When invoking again, the output is as below: JSON { "job":"software architect", "location":"Romania", "reasons":[ "High demand", "Competitive salary", "Opportunities for growth" ] } The format is the expected one, but the reasons appear less explanatory, which means additional adjustments are required in order to achieve better usability. From a proof of concept point of view, though, this is acceptable, as the focus was on the form. Conclusions Prompt design is an important part of the task – the better articulated the prompts are, the better the input and the higher the output quality. Using Spring AI to integrate with various chat models is quite straightforward – this post showcased an Open AI integration. Nevertheless, in the case of Gen AI in general, just as in the case of almost any technology, it is very important to get familiar with at least the general concepts first, to try to understand the magic behind the way the communication is carried out, and only afterward start writing “production” code. Last but not least, it is advisable to further explore the Spring AI API to understand the implementations and remain up-to-date as it evolves and improves. The code is available here. References Spring AI Reference
It wasn't long ago that I decided to ditch my Ubuntu-based distros for openSUSE, finding LEAP 15 to be a steadier, more rock-solid flavor of Linux for my daily driver. The trouble is, I hadn't yet been introduced to Linux Mint Debian Edition (LMDE), and that sound you hear is my heels clicking with joy. LMDE 6 with the Cinnamon desktop. Allow me to explain. While I've been a long-time fan of Ubuntu, in recent years the addition of snaps (rather than system packages) and other Ubuntu-only features started to wear on me. I wanted straightforward networking, support for older hardware, and a desktop that didn't get in the way of my work. For years, Ubuntu provided that, and I installed it on everything from old netbooks, laptops, and towers to IoT devices. More recently, though, I decided to move to Debian, the upstream Linux distro on which Ubuntu (and derivatives like Linux Mint and others) are built. Unlike Ubuntu, Debian holds fast to a truly solid, stable, non-proprietary mindset — and I can still use the apt package manager I've grown accustomed to. That is, every bit of automation I use (Chef and Ansible mostly) works the same on Debian and Ubuntu. I spent some years switching back and forth between the standard Ubuntu long-term releases and Linux Mint, a truly great Ubuntu-derived desktop Linux. Of course, there are many Debian-based distributions, but I stumbled across LMDE version 6, based on Debian GNU/Linux 12 "Bookworm" and known as Faye, and knew I was onto something truly special. As with the Ubuntu version, LMDE comes with different desktop environments, including the robust Cinnamon, which provides a familiar environment for any Linux, Windows, or macOS user. It's intuitive, chock full of great features (like a multi-function taskbar), and it supports a wide range of customizations. However, it includes no snaps or other Ubuntuisms, and it is amazingly stable. That is, I've not had a single freeze-up or odd glitch, even when pushing it hard with Kdenlive video editing, KVM virtual machines, and Docker containers. According to the folks at Linux Mint, "LMDE is also one of our development targets, as such it guarantees the software we develop is compatible outside of Ubuntu." That means if you're a traditional Linux Mint user, you'll find all the familiar capabilities and features in LMDE. After nearly six months of daily use, that's proven true. As someone who likes to hang on to old hardware, LMDE extended its value to me by supporting both 64- and 32-bit systems. I've since installed it on a 2008 MacBook (32-bit), old ThinkPads, old Dell netbooks, and even a Toshiba Chromebook. Though most of these boxes have less than 3 gigabytes of RAM, LMDE performs well. Cinnamon isn't the lightest desktop around, but it runs smoothly on everything I have. The running joke in the Linux world is that "next year" will be the year the Linux desktop becomes a true Windows and macOS replacement. With Debian Bookworm-powered LMDE, I humbly suggest next year is now. To be fair, on some of my oldest hardware, I've opted for Bunsen. It, too, is a Debian derivative with 64- and 32-bit versions, and I'm using the BunsenLabs Linux Boron version, which uses the Openbox window manager and sips resources: about 400 megabytes of RAM and low CPU usage. With Debian at its core, it's stable and glitch-free. Since deploying LMDE, I've also begun to migrate my virtual machines and containers to Debian 12. Bookworm is amazingly robust and works well on IoT devices, LXCs, and more.
Since it, too, has long-term support, I feel confident about its stability — and security — over time. If you're a fan of Ubuntu and Linux Mint, you owe it to yourself to give LMDE a try. As a daily driver, it's truly hard to beat.