Strings in Java I: What is a String?

Strings in Java are a reference data type comprising a sequence of Unicode characters. Strings are an extremely important data structure in any language. Java's designers recognized that strings are ubiquitous and therefore gave them first-class status. Unlike some other languages, where strings are arrays of primitive characters, strings in Java are objects (reference data types) with methods that can manipulate their data. Raw strings, which are just arrays of characters, such as those in the C programming language, do not carry such methods with them. That makes Java a convenient tool for implementing string manipulation algorithms (such as XML parsing or DOM parsing).

What is a String in Java?

A string in Java is a constant: an immutable object comprising a sequence of Unicode characters, together with methods to perform operations on those characters.
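To illustrate what "immutable" means in practice, here is a minimal sketch. The class name ImmutabilityDemo, the helper shout and the sample value "raptor" are illustrative assumptions, not part of the lesson. It suggests that a method such as toUpperCase() returns a new String and leaves the original untouched.

```java
// Illustrative sketch: class, method and sample value are hypothetical.
class ImmutabilityDemo {

    // Returns an upper-cased copy; the argument itself is never modified.
    static String shout(String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        String original = "raptor";       // hypothetical sample value
        String shouted = shout(original); // a different String object

        System.out.println(original);     // prints "raptor" (unchanged)
        System.out.println(shouted);      // prints "RAPTOR"
    }
}
```

Because no method can change the characters of an existing String, it is safe to share one String object between many parts of a program.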

“” is a valid Java string. Java supports Unicode characters, and therefore the Hindi text “कोडिंगरैप्टर। कॉम” is also a valid Java string.
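As a small sketch of both cases, the snippet below declares an empty string and the Hindi string shown above. The class and constant names are assumptions made for illustration, and the source file is assumed to be saved and compiled as UTF-8.

```java
// Illustrative sketch: class and constant names are hypothetical.
// Assumes this source file is saved and compiled as UTF-8.
class ValidStringsDemo {

    static final String EMPTY = "";                  // the empty string
    static final String HINDI = "कोडिंगरैप्टर। कॉम";  // Unicode text from the lesson

    public static void main(String[] args) {
        System.out.println(EMPTY.isEmpty());  // prints "true"
        System.out.println(EMPTY.length());   // prints "0"
        System.out.println(HINDI.isEmpty());  // prints "false"
        System.out.println(HINDI.length());   // number of UTF-16 code units in the text
    }
}
```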

Creating Java Strings

Java allows you to create strings in two different ways:

  1. By using a string literal
  2. By using the new operator (we will see later, in our lessons on object-oriented programming, that new is an operator, not a statement or just a ‘keyword’)

Using String Literal

A string literal, as noted in our previous lesson on literals, is a sequence of characters enclosed in double quotes. For example, the expression “” is a string literal.

String literals are unique. We have already seen in our lesson JVM 101 that new objects are created on the heap. String literals, however, are created in the constant pool. As its name indicates, the constant pool stores constants. Storing constants separately has several performance advantages, which we discuss below.

Up to JDK 6, the constant pool lived outside the regular heap and stack, in a memory area called the method area. Beginning with JDK 7, the constant pool for strings (now mostly referred to as the string pool) became part of the heap. Since the heap is usually much larger than the method area, we can now store more string literals in the pool before garbage collection kicks in.

When you use a literal somewhere in your code for the first time, Java creates the String object for that literal in the constant pool. When the same literal is encountered again, no new object is created; the existing String object itself (not a copy of it) is used again. Each usage of the string literal therefore refers to the same portion of memory in the constant pool.

Consider the code below

The first line creates a string literal with value “”. This is followed by a print statement containing two string literals, “I trust ” and “”. When Java encounters “I trust ”, it creates another string literal. On encountering “”, Java checks the constant pool to see whether the literal already exists. Since it does, Java does not create a new literal but reuses the existing one.
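The reuse described above can be sketched as follows. The value "example.com" is a placeholder assumption, and the class LiteralPoolDemo and helper sameObject are illustrative names. The == operator compares references, so a true result suggests that both variables point to the same pooled object.

```java
// Illustrative sketch: the literal value and all names are placeholders.
class LiteralPoolDemo {

    // True only when both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String website = "example.com";      // literal placed in the string pool
        String sameWebsite = "example.com";  // same literal again: pooled object is reused

        System.out.print("I trust ");        // a second, different literal
        System.out.println(website);

        System.out.println(sameObject(website, sameWebsite)); // prints "true"
    }
}
```

The == comparison is used here only to inspect object identity; to compare the contents of two strings, equals() is the appropriate method.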

The diagrams below show what happens when Java encounters string literals. The first diagram applies to Java up to JDK 6; the second applies to later versions.

Only one instance of “” is created, on first use. Subsequent usages do not create any new objects.

Why this Constant Pool Approach?

As you can see, storing string literals in a constant pool reduces memory usage, especially when a literal is used often.

This way of conserving memory by putting an object in a shared area is inspired by the Flyweight design pattern, one of the 23 Gang-of-Four design patterns. We will discuss Flyweight, along with other architectural and design patterns, later in our lessons on object-oriented design.

Using new Operator to Create String Objects

Sometimes you want distinct strings and want to avoid sharing your string object with other portions of code. You can use the new operator to create distinct objects. Consider the code below.

In the above example, bestWebsite and mostTrustedWebsite have the same value, but they point to different objects on the heap.
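A minimal sketch of such code, using the variable names mentioned above and assuming a placeholder value "example.com" (the class NewStringDemo and helper distinctCopy are illustrative), might look like this:

```java
// Illustrative sketch: the value "example.com" and the helper are placeholders.
class NewStringDemo {

    // Creates a fresh String object with the same characters as the argument.
    static String distinctCopy(String value) {
        return new String(value);
    }

    public static void main(String[] args) {
        String bestWebsite = distinctCopy("example.com");
        String mostTrustedWebsite = distinctCopy("example.com");

        // Same characters, so equals() reports equality.
        System.out.println(bestWebsite.equals(mostTrustedWebsite)); // prints "true"

        // Different objects on the heap, so the references differ.
        System.out.println(bestWebsite == mostTrustedWebsite);      // prints "false"
    }
}
```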

Note that the String class also has a constructor that takes an array of characters. Therefore, the following are also valid forms:

Having looked at what Java strings actually are under the hood, we next turn to what makes them so special.
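For instance, assuming a sample character array (the class CharArrayStringDemo, the helper fromChars and the sample characters are illustrative), a sketch of the character-array constructors could be:

```java
// Illustrative sketch: names and sample data are hypothetical.
class CharArrayStringDemo {

    // Builds a String from the whole array.
    static String fromChars(char[] chars) {
        return new String(chars);
    }

    // Builds a String from 'count' characters starting at index 'offset'.
    static String fromChars(char[] chars, int offset, int count) {
        return new String(chars, offset, count);
    }

    public static void main(String[] args) {
        char[] letters = {'J', 'a', 'v', 'a'};

        String whole = fromChars(letters);       // "Java"
        String part = fromChars(letters, 0, 2);  // "Ja"

        // The constructor copies the characters, so later changes
        // to the array do not affect the String.
        letters[0] = 'L';

        System.out.println(whole); // prints "Java"
        System.out.println(part);  // prints "Ja"
    }
}
```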
