Monday 26 June 2023

Difference between the internal working of HashMap in Java 7 and Java 8

In both Java 7 and Java 8, the HashMap class is used to store key-value pairs. However, there are some differences in the internal working of HashMap between these two versions.

In Java 7, the internal structure of a HashMap consists of an array of linked lists, where each array index represents a bucket. The hash code of the key is used to determine the bucket where the key-value pair will be stored. If multiple key-value pairs map to the same bucket, they are stored as nodes in a linked list at that bucket. This mechanism is known as separate chaining.

Let's consider an example to illustrate this. Suppose we have a HashMap in Java 7 with eight buckets and the following key-value pairs:


Key: "Apple" -> Value: "Red"
Key: "Banana" -> Value: "Yellow"
Key: "Grape" -> Value: "Purple"


The hash code of each key is computed, and based on that, the corresponding bucket is determined. In this example, let's assume the hash codes place "Apple" and "Grape" in bucket 3 and "Banana" in bucket 7.

The HashMap in Java 7 would then internally look like this:


Bucket 0:
Bucket 1:
Bucket 2:
Bucket 3: ("Apple" -> "Red") -> ("Grape" -> "Purple")
Bucket 4:
Bucket 5:
Bucket 6:
Bucket 7: ("Banana" -> "Yellow")

When retrieving a value from the HashMap in Java 7, the hash code of the key is used to find the bucket. Then, the linked list at that bucket is traversed to find the matching key-value pair.
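A simplified sketch of separate chaining in Java (the class and field names here are illustrative, not the actual java.util.HashMap internals):

```java
import java.util.Objects;

// Toy separate-chaining map: an array of buckets, each bucket a linked list.
class ChainedMap<K, V> {
    static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] buckets = new Node[8];

    private int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length; // hash -> bucket
    }

    public void put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) { n.value = value; return; } // replace
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // prepend to the chain
    }

    public V get(Object key) {
        int i = indexFor(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) return n.value; // walk the chain
        }
        return null;
    }
}
```

Colliding keys simply end up as successive nodes in the same bucket's list, which is why lookup in a badly collided bucket degrades to a linear scan.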

In Java 8, the HashMap class introduced a mechanism called tree bins to handle buckets that accumulate many colliding entries. When the number of entries in a single bucket reaches a threshold (8 by default, and only once the table has at least 64 buckets), the linked list in that bucket is converted into a balanced binary search tree, specifically a red-black tree. This structure allows for more efficient searching and retrieval of key-value pairs within heavily collided buckets.

Continuing with the previous example, suppose we add more key-value pairs to the HashMap:


Key: "Lemon" -> Value: "Yellow"
Key: "Orange" -> Value: "Orange"
Key: "Cherry" -> Value: "Red"

The HashMap in Java 8 would internally look like this:


Bucket 0:
Bucket 1:
Bucket 2:
Bucket 3: ("Apple" -> "Red") -> ("Grape" -> "Purple")
Bucket 4:
Bucket 5: ("Cherry" -> "Red") -> ("Lemon" -> "Yellow") -> ("Orange" -> "Orange")
Bucket 6:
Bucket 7: ("Banana" -> "Yellow")

Note that the bucket holding "Cherry", "Lemon", and "Orange" is still a linked list at this point: with only three entries it is well below the treeify threshold. If enough colliding keys accumulated in one bucket (8 or more, with a sufficiently large table), Java 8 would convert that bucket's list into a red-black tree, making lookups within the bucket logarithmic instead of linear.

When retrieving a value from the HashMap in Java 8, the hash code of the key is used to find the bucket. If the bucket holds a linked list, the entries are compared by hash code and then by key equality to find the match. If the bucket holds a red-black tree, a binary search is performed on the tree, reducing the lookup within that bucket from O(n) to O(log n).
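For reference, this is how Java 8's HashMap turns a key into a bucket index (it mirrors the OpenJDK source): the high bits of the hash code are XORed down so they still influence the index when the table is small.

```java
class Buckets {
    // Same hash spreading as java.util.HashMap in Java 8.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // n is the table size, always a power of two in HashMap,
    // so (n - 1) works as a bit mask instead of a slower modulo.
    static int indexFor(Object key, int n) {
        return (n - 1) & hash(key);
    }
}
```

A null key hashes to 0, which is why HashMap stores the null key in bucket 0.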


Sunday 25 June 2023

Stateless vs Stateful web services

Stateless and stateful web services are two architectural paradigms for designing and implementing web services. The main difference between them lies in how they handle and manage client session information.



Stateless Web Services:

Stateless web services do not maintain any session or context information about the client between multiple requests. Each request is treated independently, and the server does not retain any knowledge of previous requests from the same client.

Characteristics of Stateless Web Services:

  • No session information is stored on the server.
  • Each request is self-contained and independent.
  • Scalability is generally better since there is no need to manage and synchronize session state.
  • Suitable for scenarios where requests can be handled in isolation, without relying on past interactions.
  • Stateless services are generally easier to implement and test.

Example: RESTful APIs are commonly designed as stateless web services. In a RESTful API, each request from the client contains all the necessary information for the server to process it. The server does not maintain any session state. The client includes any required authentication tokens or data in the request headers, allowing the server to authenticate and process the request without relying on previous interactions.
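As a toy illustration (the request shape and token format here are invented for the example), a stateless handler can serve each request purely from what the request itself carries, with no server-side session:

```java
import java.util.Map;

// Hypothetical stateless handler: every request carries everything the
// server needs (here, an Authorization header), so no per-client state exists.
class StatelessGreeter {
    // Stand-in for real token verification (which would be cryptographic).
    static boolean isValid(String token) {
        return token != null && token.startsWith("Bearer ");
    }

    static String handle(Map<String, String> headers) {
        String token = headers.get("Authorization");
        if (!isValid(token)) return "401 Unauthorized";
        // The response depends only on this request, never on earlier ones.
        return "200 OK: hello, " + token.substring("Bearer ".length());
    }
}
```

Because `handle` keeps no state between calls, any server instance can answer any request, which is what makes stateless services easy to scale horizontally.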

Stateful Web Services:

Stateful web services maintain session or context information about the client across multiple requests. The server keeps track of the client's state and uses it to handle subsequent requests from the same client. Session information is typically stored on the server or in a shared session store.

Characteristics of Stateful Web Services:

  • Session information is stored on the server.
  • Server maintains state information for each client, allowing it to provide personalized services.
  • Clients are associated with a session identifier or token to retrieve their session data.
  • More complex to implement and maintain due to managing and synchronizing session state.
  • Can be beneficial in scenarios where maintaining state information is necessary, such as shopping carts, user authentication, and multi-step processes.

Example: An e-commerce website that allows users to add items to a shopping cart is an example of a stateful web service. The server stores the user's shopping cart information, and the user can perform multiple operations, such as adding or removing items, within the same session. The server retrieves and updates the session data with each request to provide a consistent shopping experience.
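A minimal sketch of the server-side state involved, using a hypothetical in-memory session store (a real service would use a database or distributed cache, plus session expiry):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical stateful cart service: the server keeps a cart per session id,
// so consecutive requests from the same client share state.
class CartService {
    private final Map<String, List<String>> carts = new HashMap<>();

    String openSession() {
        String id = UUID.randomUUID().toString(); // identifier handed to the client
        carts.put(id, new ArrayList<>());
        return id;
    }

    void addItem(String sessionId, String item) {
        carts.get(sessionId).add(item); // state survives between requests
    }

    List<String> viewCart(String sessionId) {
        return carts.getOrDefault(sessionId, List.of());
    }
}
```

The client only ever sends the session id; everything else lives on the server, which is exactly the synchronization burden the stateless model avoids.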

In summary, the choice between stateless and stateful web services depends on the requirements of the application. Stateless services are simpler, scalable, and suitable for scenarios where requests can be handled independently. Stateful services, on the other hand, provide more personalized and persistent interactions but require additional management of session state.

Saturday 24 June 2023

What are the terminal and intermediate operations in Java streams?

In Java streams, intermediate and terminal operations are the two main types of operations that can be performed on a stream.

  1. Intermediate Operations:

Intermediate operations are applied to a stream and produce a new stream as a result. These operations are typically used for transforming, filtering, or sorting the elements of a stream. They are lazy, meaning they are not executed until a terminal operation is invoked on the stream.
Some common intermediate operations in Java streams include:

  • map: Applies a function to each element and transforms it into another type.
  • filter: Filters out elements based on a specified condition.
  • sorted: Sorts the elements of the stream based on a comparator.
  • distinct: Removes duplicate elements from the stream.
  • limit: Limits the number of elements in the stream to a specified size.
  • flatMap: Transforms each element into a stream and flattens the result.
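Several of these intermediate operations chained together on a small list; note that nothing actually runs until the terminal collect() at the end triggers the pipeline:

```java
import java.util.List;
import java.util.stream.Collectors;

class IntermediateOps {
    static List<Integer> demo(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 3)   // keep words longer than 3 chars
                .distinct()                    // drop duplicate words
                .map(String::length)           // transform each word to its length
                .sorted()                      // ascending order
                .limit(3)                      // at most three results
                .collect(Collectors.toList()); // terminal op: runs the pipeline
    }
}
```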
  2. Terminal Operations:

Terminal operations are applied to a stream and produce a non-stream result, such as a value, a collection, or a side effect. When a terminal operation is invoked, the intermediate operations are executed on the stream elements to produce the final result.

Some common terminal operations in Java streams include:

  • forEach: Performs an action on each element of the stream.
  • collect: Collects the elements of the stream into a collection or a single value.
  • reduce: Combines the elements of the stream into a single value using a specified operation.
  • count: Returns the count of elements in the stream.
  • anyMatch, allMatch, noneMatch: Check if any, all, or none of the elements satisfy a given condition.
  • min, max: Returns the minimum or maximum element of the stream based on a comparator.
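A few of these terminal operations on a small list; each helper consumes its own stream, since a stream can be traversed only once:

```java
import java.util.List;
import java.util.Optional;

class TerminalOps {
    static long howMany(List<String> words) {
        return words.stream().count();
    }

    // reduce: fold the lengths into a single total
    static int totalLength(List<String> words) {
        return words.stream().map(String::length).reduce(0, Integer::sum);
    }

    static boolean anyStartsWith(List<String> words, String prefix) {
        return words.stream().anyMatch(w -> w.startsWith(prefix));
    }

    // max with an explicit comparator; empty input yields Optional.empty()
    static Optional<String> longest(List<String> words) {
        return words.stream().max((a, b) -> Integer.compare(a.length(), b.length()));
    }
}
```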

It's important to note that a stream pipeline typically consists of a sequence of intermediate operations followed by a terminal operation. The intermediate operations are executed in a lazy manner, only when the terminal operation is triggered. This lazy evaluation allows for optimized and efficient processing of large data sets.






Friday 23 June 2023

What is the Count-Min Sketch algorithm?

The Count-Min Sketch algorithm is a probabilistic data structure used for estimating the frequency of elements in a stream of data. It was introduced by Cormode and Muthukrishnan in 2005 as a memory-efficient and scalable approach to approximate counting.

The Count-Min Sketch consists of a two-dimensional array of counters with depth rows and width columns, where each row is associated with its own hash function. When an element is encountered in the stream, it is hashed once per row using that row's hash function, and the corresponding counter in each row is incremented.




Here's a high-level overview of the Count-Min Sketch algorithm:

  1. Initialization: Create a two-dimensional array of counters with dimensions width x depth, initially set to zero.
  2. Hashing: Choose a set of k hash functions, each mapping elements to one of the counters in the array.
  3. Incrementing: When an element is encountered in the stream, hash it using each of the k hash functions. Increment the corresponding counters in the array for each hash function.
  4. Estimation: To estimate the frequency of an element, hash it using the same set of k hash functions. Retrieve the minimum value among the counters associated with the hashed positions.
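The four steps above can be sketched as a small class. The seeded hashing here is an illustrative stand-in for a proper pairwise-independent hash family:

```java
import java.util.Random;

class CountMinSketch {
    private final long[][] counters; // depth rows x width columns (step 1)
    private final int[] seeds;       // one hash seed per row (step 2)
    private final int width;

    CountMinSketch(int width, int depth) {
        this.width = width;
        this.counters = new long[depth][width];
        this.seeds = new int[depth];
        Random r = new Random(42);   // fixed seed so the sketch is reproducible
        for (int i = 0; i < depth; i++) seeds[i] = r.nextInt();
    }

    private int index(Object item, int row) {
        int h = item.hashCode() ^ seeds[row];
        h ^= (h >>> 16);                 // spread the bits a little
        return Math.floorMod(h, width);
    }

    void add(Object item) {
        for (int row = 0; row < counters.length; row++)
            counters[row][index(item, row)]++;           // step 3: bump every row
    }

    long estimate(Object item) {
        long min = Long.MAX_VALUE;                        // step 4: row minimum
        for (int row = 0; row < counters.length; row++)
            min = Math.min(min, counters[row][index(item, row)]);
        return min;
    }
}
```

Taking the minimum across rows is what keeps the estimate from undercounting: every row's counter is at least the true frequency, and collisions can only inflate it.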

The Count-Min Sketch provides an approximation of element frequencies in the stream. It guarantees that the estimated frequency is always greater than or equal to the actual frequency. However, due to collisions and limited counter sizes, there is a probability of overestimation. The accuracy of the estimation depends on the width and depth of the sketch.

Count-Min Sketches are particularly useful when memory usage is a concern or when processing large-scale data streams. They are commonly employed in areas such as network traffic monitoring, approximate query processing, frequency analysis, and streaming algorithms.






Thursday 22 June 2023

What is the Rsync algorithm?

The Rsync algorithm is a file synchronization and transfer algorithm that efficiently detects and transfers the differences between two files or directories. It was developed by Andrew Tridgell and Paul Mackerras in 1996.




The key idea behind the Rsync algorithm is the concept of delta encoding. Instead of transferring an entire file, Rsync identifies the portions of the file that have changed and transfers only those differences (called deltas). This makes the algorithm particularly efficient when transferring large files or synchronizing files over a network.




The Rsync algorithm involves two parties: the sender, which holds the new version of the file, and the receiver, which holds an older copy.

Receiver Phase:

  1. The receiver splits its copy of the file into fixed-size, non-overlapping blocks (typically a few hundred bytes to a few kilobytes) and computes two checksums for each block: a cheap rolling checksum and a stronger hash (MD4 in the original implementation).
  2. The receiver sends this list of checksums, along with the block positions, to the sender.

Sender Phase:

  1. The sender slides a window of the same block size over its version of the file, computing the rolling checksum at every byte offset. The "rolling" property means the checksum of the next window can be derived from the previous one in constant time, rather than being recomputed from scratch.
  2. Whenever the rolling checksum matches one of the receiver's block checksums, the sender verifies the match with the strong hash to rule out a false positive.
  3. Matched regions are sent as short block references (effectively "you already have this block"); bytes that match no block are sent literally.
  4. The receiver reconstructs the up-to-date file from its existing blocks plus the literal data it received.

This process continues until the receiver has synchronized the entire file or directory with the sender.
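The weak rolling checksum at the heart of this scheme can be sketched as follows. This is an Adler-32-like sum in the spirit of Tridgell's design, not the exact rsync implementation, and in practice it is always paired with a strong per-block hash:

```java
class RollingChecksum {
    static final int MOD = 1 << 16;

    // Checksum of the window data[start .. start+len): a is the byte sum,
    // b weights earlier bytes more heavily; both are packed into one int.
    static int checksum(byte[] data, int start, int len) {
        int a = 0, b = 0;
        for (int i = 0; i < len; i++) {
            a = (a + (data[start + i] & 0xff)) % MOD;
            b = (b + a) % MOD;
        }
        return (b << 16) | a;
    }

    // Slide the window one byte right in O(1): drop `out`, take in `in`.
    static int roll(int sum, int len, byte out, byte in) {
        int a = sum & 0xffff, b = (sum >>> 16) & 0xffff;
        a = Math.floorMod(a - (out & 0xff) + (in & 0xff), MOD);
        b = Math.floorMod(b - len * (out & 0xff) + a, MOD);
        return (b << 16) | a;
    }
}
```

The O(1) `roll` step is what makes it feasible to check the checksum at every byte offset of a large file; recomputing from scratch at each offset would be O(file size x block size).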

The Rsync algorithm's efficiency lies in its ability to minimize the amount of data transmitted by transferring only the differences between files. This makes it an excellent choice for efficient file synchronization, remote backups, and network transfers with limited bandwidth.