Catalog of Good Practices#
This catalog has been prepared by gathering recommended good practices in code development. Additionally, measurements have been made with the JoularJX tool of the energy consumption of some well-known algorithms and data structures implemented in Java.
For each entry in the catalog, you will find the description of benchmarks, in some cases the specification of the tests we performed and bibliographical references.
If you are interested and want to reproduce the tests, all the Java programs and shell scripts used to perform the measurements are available in:
or, alternatively, you can copy the following command on your terminal:
wget https://www-inf.telecom-sudparis.eu/COURS/cen/Mesures/SoftwareEfficiencyCatalog.zip
Note
For more information on JoularJX:
Specifications of the testing PC#
OS |
Ubuntu 22.04 LTS 64 bits GNOME 42.2 |
---|---|
Model |
Dell Inc. Latitude 5580 |
CPU |
Intel® Core™ i7-7600U CPU @ 2.80GHz × 4 |
RAM |
32 Go |
Measurement tool |
JoularJX 1.0 |
Java Version |
openjdk 17.0.3 2022-04-19 |
Sorting#
The time complexity of a sorting algorithm can not be better than \(O(n \log(n))\).
Many sorting algorithms exist, we only present a few of them:
Slow sorts#
These algorithms are simple to implement but have a time complexity of \(O(n^2)\).
int[] |
ArrayList<Integer> |
ArrayList<String> |
|
---|---|---|---|
Size |
\(50\ 000\) |
\(50\ 000\) |
\(50\ 000\) |
Bubble sort |
\(68.89 \pm 3.06\ J\) |
\(265.28 \pm 9.01\ J\) |
\(593.44 \pm 11.12\ J\) |
Selection sort |
\(2.77 \pm 2.14\ J\) |
\(37.97 \pm 1.49\ J\) |
\(295.76 \pm 8.98\ J\) |
Insertion sort |
\(14.51 \pm 0.79\ J\) |
\(73.06 \pm 7.40\ J\) |
\(143.99 \pm 9.27\ J\) |
We can notice that the selection sort is here surprisingly better than the insertion sort to sort lists or arrays of integers.
Quick Sorts#
These algorithms have a time complexity of \(O(n\log(n))\). The sort()
method is a built-in method of Java. We did not code the Tim sort and the dual-pivot quicksort, but we show here which sort has been selected by Java developers.
Note that the Array.sort()
method uses a dual-pivot quicksort for primitive arrays (int[], bytes[], long[], etc) but a Tim sort for Object arrays.
int[] |
ArrayList<Integer> |
ArrayList<String> |
|
---|---|---|---|
Size |
\(3\ 000\ 000\) |
\(3\ 000\ 000\) |
\(3\ 000\ 000\) |
Merge sort |
\(11.33 \pm 0.65\ J\) |
\(42.18 \pm 2.21\ J\) |
\(59.08 \pm 2.46\ J\) |
(one-pivot) Quicksort |
\(6.38 \pm 0.38\ J\) |
\(23.72 \pm 1.00\ J\) |
\(49.78 \pm 3.48\ J\) |
Tim sort (Collections.sort()) |
|
\(22.73 \pm 1.88\ J\) |
\(37.10 \pm 6.54\ J\) |
Dual-pivot quicksort (Arrays.sort() for int[]) |
\(2.57 \pm 0.64\ J\) |
|
|
Conclusion#
It is acceptable to use a slow sort on a small list or to learn how to code, but it should never be used for bigger lists.
Moreover, if you know how to use a Comparator, you could simply use the implemented sort(Comparator) method.
Test specifications#
Click to see
The methods used are in the collections_comparison.ListSortingComparison.java and the collections_comparison.ArraySortingComparison.java classes.
The test was launched using the collections_comparison.TestListSortingComparison.java and the collections_comparison.TestArraySortingComparison.java classes with JoularJX by using the bash script and the files located in the folder scripts/test/testListSortingComparison/ and in scripts/test/testArraySortingComparison/ .
We launched 10 times the test class to get these means and standard deviations.
The integers used where randomly selected with the ThreadLocalRandom
class and its nextInt()
method.
We used this script to quickly launch the tests for the List
:
Click to see
#!/bin/bash
# Launches testListSortingComparison.sh
bash testListSortingComparison.sh 10 0 3000000 Integer # Quick sorts
bash testListSortingComparison.sh 10 1 50000 Integer # Slow sorts
bash testListSortingComparison.sh 10 0 3000000 String # Quick sorts
bash testListSortingComparison.sh 10 1 50000 String # Slow sorts
```
</details>
We used this script to quickly launch the tests for the `int[]`:
<details>
<summary>Click to see</summary>
```bash
#!/bin/bash
# Launches testArraySortingComparison.sh
bash testArraySortingComparison.sh 10 0 3000000 # Quick sorts
bash testArraySortingComparison.sh 10 1 50000 # Slow sorts
```
</details>
</details>
</details>
******
## SQL Queries
### Query Performance
The advices we can give you are really depending on the database you are using. General good practices are:
- If possible, use the `LIMIT` clause. The gain will be proportional to the quantity of retrieved data.
- The `SELECT *` should be avoided. The select should only contain the columns you need and no other one.
- Unnecessary `JOIN` also.
- To analyze the query, you can add the `EXPLAIN FORMAT=JSON` cause before the query. It will returned a table with one row containing a JSON column and its content is a representation of the query plan.
The cause to add depends on the SQL server used. For our server (MariaDB), we need to use `ANALYZE FORMAT=JSON`.
We give you a simple example of a JSON file with a query on our database.
The query is:
```sql
ANALYZE FORMAT=JSON
SELECT
Orders.Id
FROM
Orders
WHERE
ShippedDate IS NULL
```
and it gives us the JSON:
```JSON
{
"query_block": {
"select_id": 1,
"r_loops": 1,
"r_total_time_ms": 2.429,
"table": {
"table_name": "Orders",
"access_type": "ALL",
"r_loops": 1,
"rows": 6166,
"r_rows": 6110,
"r_total_time_ms": 2.1289,
"filtered": 100,
"r_filtered": 14.845,
"attached_condition": "Orders.ShippedDate is null"
}
}
}
```
- `r_loops` show the number of times the node was actually executed. Here the table `Orders` was only scanned once.
- `r_total_time_ms` show the time that was spent to execute the node. Here the query took 2.429 ms to be traited.
- `r_rows` show how many rows were actually read. Here 6110 rows of the `Orders` table were read. `rows` shows the number of rows expected to be read by the optimizer, but here, the optimizer did not have any way to estimate it.
- `r_filtered` show the percentage of rows that have been actually filtered with the `attached_condition`. In the same way as with `rows` and `r_rows`, `filtered` show what percentage was expected by the optimizer.
Other data can be obtained with the `EXPLAIN/ANALYZE` statement, you will then have to see the documentation in order to understand them.
### References
- **Les 115 bonnes pratiques**, 2022, collectif GreenIT. https://github.com/cnumr/best-practices. BP 75 "Optimiser les requêtes aux bases de données" - https://github.com/cnumr/best-practices/blob/main/chapters/BP_075_fr.md
- **MariaDB - ANALYZE and EXPLAIN Statements**, 2022, https://mariadb.com/kb/en/analyze-and-explain-statements/
*****
## String concatenation
<details>
<summary>Click to see</summary>
Be aware that to concatenate $n$ strings requires a time complexity of $O(n^2)$ because strings are immutable. Therefore, each concatenation results in the creation of a new string, which means poor performance.
Consider using the `StringBuilder` class when appending characters for example. Alternatively, use a character array, which is even better optimized.
The thread-safe version of the `StringBuilder` is the `StringBuffer`.
******
### Use comparison
````{div}
```{list-table}
:header-rows: 1
:name: java-strings
* -
- String
- StringBuilder
- StringBuffer
* - **Mutable**
- No
- Yes
- Yes
* - **Thread-safe**
- The String is immutable, therefore the object is thread-safe
- No
- Yes
* - **Append chars at the end**
- $488.73 \pm 28.39\ J$
- $0.32 \pm 0.14\ J$
- $0.33 \pm 0.27\ J$
* - **Append chars at the beginning**
- $219.64 \pm 23.91\ J$
- $47.50 \pm 2.83\ J$
- $42.74 \pm 4.54\ J$
```
````
******
### Test specifications
<details>
<summary>Click to see</summary>
The methods used are in the {{'[string_manipulation.StringManipulationComparison.java]({repo}/EfficaciteLogiciel/src/main/java/ecolog_tsp_uppa/string_manipulation/StringManipulationComparison.java)'.format(repo=urlweb)}} class.
The test was launched using the {{'[string_manipulation.TestStringManipulation.java]({repo}/EfficaciteLogiciel/src/main/test/ecolog_tsp_uppa/string_manipulation/TestStringManipulation.java)'.format(repo=urlweb)}} class with JoularJX by using the bash script and the files located in the folder {{'[scripts/test/testStringManipulation/]({repo}/EfficaciteLogiciel/scripts/test/testStringManipulation/)'.format(repo=urlweb)}}.
We launched 5 times the test class to get these means and standard deviations.
To get the results of the "Append chars" parts, we start with a empty String/StringBuilder/StringBuffer and we append 500 000 times the same character 'a'.
We used this script to quickly launch those tests:
<details>
<summary>Click to see</summary>
```bash
#!/bin/bash
# Launches testStringManipulation.sh with the 3 possible choices with size 500 000
bash testStringManipulation.sh 10 0 500000 # String
bash testStringManipulation.sh 10 1 500000 # StringBuilder
bash testStringManipulation.sh 10 2 500000 # StringBuffer
```
</details>
</details>
****
### References
* **Effective Java**. Third Edition. Joshua Bloch. "Item 64: Beware the performance of String concatenation".
</details>
****
## Java List
<details>
<summary>Click to see</summary>
### Choice of a list class
The `List` interface (and associated implementation classes) is one of the most used data structures for collections in Java. Java classes that implement this interface may be more consuming and less efficient than a Java `array` but it is much easier and safer to use.
When using a `List` in Java, you need to choose between different implementations. The main ones are the `ArrayList`, which is based on an array, and the `LinkedList`, which is based on a doubly linked list. Moreover, the `Vector`, which is based on the `ArrayList`, can be used for thread-safe usage. Here are the time complexities and energy consumptions resulting from a test of the main methods.
#### Main choices
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-data-structure
* -
- ArrayList
- LinkedList
- Vector
* - **Data structure**
- Based on an array
- Based on a doubly linked list
- Based on an array
* - **Thread-safe**
- No
- No
- Yes
```
````
For the memory use, the `LinkedList` nodes have to store the adresses of the previous and the next nodes in addition of the data.
On the other hand, if the reserved capacity for the `ArrayList` is higher than what it actually needs, then the remaining memory is wasted.
So, if we know the exact capacity to use and we reserve it at the creation of an `ArrayList`, then the `ArrayList` would take less memory than a `LinkedList` filled with the same data.
******
#### Benchmark 1
<details>
<summary>Click to expand for benchmark 1</summary>
The following results were obtained with Lists of 100 000 elements.
<h4> Insertion of elements </h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-insertion
* -
- ArrayList
- LinkedList
- Vector
* - **List.add(Object)**
- $O(1)$<br> (except if the ArrayList has to be extended: $O(n)$)
- $O(1)$
- *same as the ArrayList*
* - **List.add(int index, Object)**
- $O(n)$<br>($O(1)$ if adding at the end)
- $O(n)$<br>($O(1)$ if adding at the beginning or at the end)
- *same as the ArrayList*
* - **Insertion** at the beginning
- $10.605 \pm 0.421\ J$
- $0.73 \pm 0.22 \ J$
- $10.619 \pm 0.674\ J$
* - **Insertion** in the middle
- $4.494 \pm 0.286\ J$
- $\mathbf{564.60 \pm 28.95\ J}$
- $4.466 \pm 0.225\ J$
* - **Insertion** at the end
- $0.249 \pm 0.026\ J$
- $0.41 \pm 0.05\ J$
- $0.282 \pm 0.023\ J$
```
````
******
<h4> Remove elements </h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-remove
* -
- ArrayList
- LinkedList
- Vector
* - **List.remove(int index)**
- $O(n)$<br>($O(1)$ if removing at the end)
- $O(n)$<br>($O(1)$ if removing at the beginning or at the end)
- *same as the ArrayList*
* - **Remove** randomly half of the elements by using List.remove(int index)
- $3.587 \pm 0.226\ J$
- $\mathbf{211.21 \pm 24.93\ J}$
- $3.516 \pm 0.247\ J$
* - **Remove** elements whose index is even by using List.iterator() and Iterator.remove()
- $4.656 \pm 0.296\ J$
- $0\ J$
- $4.769 \pm 0.247\ J$
```
````
******
<h4>Accessing the elements</h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-access
* -
- ArrayList
- LinkedList
- Vector
* - **List.get(int index)**
- $O(1)$
- $O(n)$<br>($O(1)$ if getting at the beginning or at the end)
- *same as the ArrayList*
* - **Random access**
- $0.072 \pm 0.026\ J$
- $\mathbf{306.84 \pm 26.78\ J}$
- $0.134 \pm 0.034\ J$
* - **Iterate** by using List.get(int index)
- $0.035 \pm 0.017\ J$
- $\mathbf{295.52 \pm 18.27\ J}$
- $0.069 \pm 0.034\ J$
* - **Iterate** by using List.iterator() and Iterator.next()
- $0.047 \pm 0.015\ J$
- $0.17 \pm 0.00\ J$
- $0.061 \pm 0.025\ J$
* - **Iterate** by using the enhanced for-each loop
- $0.046 \pm 0.016\ J$
- $0.22 \pm 0.02\ J$
- $0.037 \pm 0.012\ J$
```
````
******
<h4> Sort the list</h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-sort
* -
- ArrayList
- LinkedList
- Vector
* - **Best sorting complexity**
- $O(n \log(n))$
- $O(n \log(n))$
- $O(n \log(n))$
* - **Sort** by using Collections.sort(List)
- $1.062 \pm 0.053\ J$
- $1.32 \pm 0.30\ J$
- $1.081 \pm 0.059\ J$
```
````
</details>
******
#### Benchmark 2
<details>
<summary>Click to expand for benchmark 2</summary>
The following results were obtained with Lists of 500 000 elements.
This time, we did not use the methods that require `List.get(int)` with the `LinkedList`. It will be written "-" in the table when no testing was done for the concerning method.
******
<h4> Insertion of elements</h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-insertion
* -
- ArrayList
- LinkedList
- Vector
* - **List.add(Object)**
- $O(1)$<br> (except if the ArrayList has to be extended: $O(n)$)
- $O(1)$
- *same as the ArrayList*
* - **List.add(int index, Object)**
- $O(n)$<br>($O(1)$ if adding at the end)
- $O(n)$<br>($O(1)$ if adding at the beginning or at the end)
- *same as the ArrayList*
* - **Insertion** at the beginning
- $\mathbf{450.40 \pm 24.04\ J}$
- $1.349 \pm 0.063 \ J$
- $\mathbf{456.33 \pm 16.97\ J}$
* - **Insertion** in the middle
- $163.92 \pm 7.74\ J$
- `-`
- $157.13 \pm 10.00 \ J$
* - **Insertion** at the end
- $1.19 \pm 0.69\ J$
- $1.259 \pm 0.041\ J$
- $1.32 \pm 0.17\ J$
```
````
******
<h4> Remove elements </h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-remove
* -
- ArrayList
- LinkedList
- Vector
* - **List.remove(int index)**
- $O(n)$<br>($O(1)$ if removing at the end)
- $O(n)$<br>($O(1)$ if removing at the beginning or at the end)
- *same as the ArrayList*
* - **Remove** randomly half of the elements by using List.remove(int index)
- $163.93 \pm 6.96\ J$
- `-`
- $165.89 \pm 3.22\ J$
* - **Remove** elements whose index is even by using List.iterator() and Iterator.remove()
- $205.24 \pm 6.60\ J$
- $0.257 \pm 0.333\ J$
- $165.89 \pm 3.22\ J$
```
````
******
<h4> Accessing the elements </h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-access
* -
- ArrayList
- LinkedList
- Vector
* - **List.get(int index)**
- $O(1)$
- $O(n)$<br>($O(1)$ if getting at the beginning or at the end)
- *same as the ArrayList*
* - **Random access**
- $0.96 \pm 0.14\ J$
- `-`
- $1.40 \pm 0.13\ J$
* - **Iterate** once by using List.get(int index)
- $0.29 \pm 0.01\ J$
- `-`
- $0.30 \pm 0.01\ J$
* - **Iterate** once by using List.iterator() and Iterator.next()
- $0\ J$
- $0.149 \pm 0.034\ J$
- $0.22 \pm 0.03\ J$
* - **Iterate** once by using the enhanced for-each loop
- $0.25 \pm 0.03\ J$
- $0.173 \pm 0.032\ J$
- $0.33 \pm 0.09\ J$
```
````
******
<h4> Sort the list </h4>
````{div}
```{list-table}
:header-rows: 1
:name: java-lists-sort
* -
- ArrayList
- LinkedList
- Vector
* - **Best sorting complexity**
- $O(n \log(n))$
- $O(n \log(n))$
- $O(n \log(n))$
* - **Sort** by using Collections.sort(List)
- $2.16 \pm 2.18\ J$
- $6.339 \pm 0.307\ J$
- $3.12 \pm 2.32\ J$
```
````
</details>
******
### Conclusion
The `LinkedList` is designed to be used as a list on which we iterate (by using an `Iterator` or by using the enhanced for-each loop) or to be used as a queue, that is working with the head or with the tail of the list.
For general purposes, the `ArrayList` seems to be a better choice. But if we have to work on elements that are near the head or the tail of the list, then the `LinkedList` would have better performance, otherwise, it has poor performance when it has to explore deeply the list. Moreover, if you need a thread-safe list, the `Vector`, which is based on the `ArrayList`, is a synchronized object you could use without losing too much performance.
******
### Test specifications
<details>
<summary>Click to see</summary>
The methods used are in the {{'[collections_comparison.ListMethodsComparison.java]({repo}/EfficaciteLogiciel/src/main/java/ecolog_tsp_uppa/collections_comparison/ListMethodsComparison.java)'.format(repo=urlweb)}} class.
The test was launched using the {{'[collections_comparison.TestListMethodsComparison.java]({repo}/EfficaciteLogiciel/src/test/java/ecolog_tsp_uppa/collections_comparison/TestListMethodsComparison.java)'.format(repo=urlweb)}} class with JoularJX by using the bash script and the files located in the folder {{'[scripts/test/testListMethodsComparison/]({repo}/EfficaciteLogiciel/scripts/test/testListMethodsComparison/)'.format(repo=urlweb)}}.
We write in the table the mean and the standard deviation obtained after launching the testing class 10 times to get a mean and a standard deviation of the methods' consumption and looping over the functions 10 times (so we divided the results by 10).
We used this script to quickly launch those tests:
<details>
<summary>Click to see</summary>
```bash
#!/bin/bash
### Benchmark 1 ###
# Launches testListMethodsComparison.sh with the 3 possible lists with size 100 000
bash testListMethodsComparison.sh 10 0 100000 1 10 # ArrayList # Use of List.get() and loop 10 times
bash testListMethodsComparison.sh 10 1 100000 # LinkedList
bash testListMethodsComparison.sh 10 2 100000 1 10 # Vector
### Benchmark 2 ###
# Launches testListMethodsComparison.sh with the 3 possible lists with size 500 000
bash testListMethodsComparison.sh 10 0 500000 # ArrayList
bash testListMethodsComparison.sh 10 1 500000 0 10 # LinkedList # No use of List.get() and loop 10 times
bash testListMethodsComparison.sh 10 2 500000 # Vector
The results were obtained using List
of strings. The strings were randomly created following the pattern “Number is %d”, where %d is a random integer. When the energy consumption is 0 J, it means that the execution was to quick for JoularJX to give a value for the concerned method.
When we iterate, we only iterate one time over the whole list.
For random access, we use List.get(int randomInt)
, where randomInt is a random integer, as many times as there are elements in the list, that is 100 000 times in our case.
For the second benchmark, with the LinkedList
, we have launched the methods 10 times in order for JoularJX to give us a power consumption higher than 0 J, so we divided by 10 the results we obtained.
References#
Effective Java. Third Edition. Joshua Bloch. “Item 28: Prefer
List
toArray
”.Energy profiles of Java collections classes. Samir Hasan, Zachary King, Munawar Hafiz, Mohammed Sayagh, Bram Adams, and Abram Hindle - 38th International Conference on Software Engineering (2016). pdf
Java™ Platform, Standard Edition 8 API Specification, (2022). Package Util. [online] Available at: https://docs.oracle.com/javase/8/docs/api/java/util/package-summary.html
Java Map#
Choice of the map class#
A Map
is an interface that maps keys to values.
When using a Map
in Java, you have to choose between differents implementations. The main ones are the HashMap
, the LinkedHashMap
and the TreeMap
. Moreover, the Hashtable
and the ConcurrentHashMap
can be used for thread-safe usage. These implementations differs in the data structures used. Here are the average time complexities and energy consumptions resulting from a test for the main methods.
Main choices#
HashMap |
LinkedHashMap |
TreeMap |
Hashtable |
ConcurrentHashMap |
|
---|---|---|---|---|---|
Data structure |
Hash table |
Hash table and linked lists (for keys, values and entries collections) |
Red-Black tree |
Hash table |
Hash table |
Order of collections |
No (random order) |
Order of insertion (as a LinkedList) |
Natural order |
No (random order) |
No (random order) |
Permits null as key |
Yes |
Yes |
No |
No |
No |
Thread-safe |
No |
No |
No |
Yes |
Yes |
The order of the TreeMap
is the natural order by default, but other orders can be used by precising the Comparator
when creating the TreeMap. Moreover, if this Comparator
allows null
, then null
can be used as a key.
Benchmark 1#
Click to expand for benchmark 1
Insertion of elements
HashMap |
LinkedHashMap |
TreeMap |
Hashtable |
ConcurrentHashMap |
|
---|---|---|---|---|---|
Map.put( Object key, Object value ) |
\(O(1)\) |
\(O(1)\) |
\(O(\log(n))\) |
\(O(1)\) |
\(O(1)\) |
Insertion |
\(393.97 \pm 8.65\ J\) |
\(444.42 \pm 10.33\ J\) |
\(\mathbf{918.49 \pm 19.11\ J}\) |
\(421.14 \pm 11.02\ J\) |
\(431.79 \pm 8.66\ J\) |
However, if we use a bad hashing function, the worst-case scenario for the hash tables is \(O(n)\) (as if we iterate over a List).
Remove elements
HashMap |
LinkedHashMap |
TreeMap |
Hashtable |
ConcurrentHashMap |
|
---|---|---|---|---|---|
Map.remove( Object key ) |
\(O(1)\) |
\(O(1)\) |
\(O(\log(n))\) |
\(O(1)\) |
\(O(1)\) |
Remove half the elements |
\(13.61 \pm 6.12\ J\) |
\(28.87 \pm 6.68\ J\) |
\(66.67 \pm 8.91\ J\) |
\(15.66 \pm 6.41\ J\) |
\(17.70 \pm 4.20\ J\) |
Access
HashMap |
LinkedHashMap |
TreeMap |
Hashtable |
ConcurrentHashMap |
|
---|---|---|---|---|---|
Map.get( Object key ) |
\(O(1)\) |
\(O(1)\) |
\(O(\log(n))\) |
\(O(1)\) |
\(O(1)\) |
Random access |
\(37.40 \pm 0.81\ J\) |
\(44.00 \pm 2.54\ J\) |
\(296.74 \pm 20.49\ J\) |
\(42.49 \pm 2.73\ J\) |
\(39.37 \pm 2.75\ J\) |
Use Map.get() for each key |
\(33.77 \pm 1.88\ J\) |
\(39.85 \pm 2.75\ J\) |
\(94.39 \pm 7.31\ J\) |
\(27.93 \pm 3.21\ J\) |
\(23.92 \pm 1.49\ J\) |
Iterate over the Map.entrySet() with the Set.iterator() |
\(24.32 \pm 3.92\ J\) |
\(8.01 \pm 1.88\ J\) |
\(42.99 \pm 10.61\ J\) |
\(22.87 \pm 4.04\ J\) |
\(27.74 \pm 4.42\ J\) |
Iterate over the Map.entrySet() with a for-each loop |
\(19.24 \pm 2.95\ J\) |
\(6.28 \pm 1.53\ J\) |
\(39.40 \pm 6.49\ J\) |
\(16.20 \pm 0.94\ J\) |
\(17.27 \pm 3.58\ J\) |
Conclusion#
The choice really depends on the usage.
If the order of elements does not matter, then do not take the TreeMap
, which has the worse performance overall, supposing the hash function used for the hash table is good enough.
For thread-safe usage, our results show a little difference between the Hashtable
and the ConcurrentHashMap
. Due to their way of dealing with read-write locks, the ConcurrentHashMap
was supposed to be overall better than the Hashtable
but their consumption are pretty close.
For general purposes, take the HashMap
. TreeMap
has the worst performance for a Map
implementation and should be used only if you know what you are doing.
Moreover, if you want to insert and keep elements in a particular order, then you might think of using the LinkedHashMap
. This latter is also useful for copying a Map
while keeping its order.
Test specifications#
Click to see
The methods used are in the collections_comparison.MapMethodsComparison.java class.
The test was launched using the collections_comparison.TestMapMethodsComparison.java class with JoularJX by using the bash script and the files located in the folder scripts/test/testMapMethodsComparison/.
We launched 10 times the test class to get these means and standard deviations. The results were obtained using 20 000 000 entries where the key was a random integer and the value was a string, which was created following the pattern “Number is %d”, where %d is an other random integer. When we iterate, we only iterate one time over the whole list.
We used this script to quickly launch those tests:
Click to see
#!/bin/bash
# Launches testMapMethodsComparison.sh
bash testMapMethodsComparison.sh 10 0 20000000 # HashMap
bash testMapMethodsComparison.sh 10 1 20000000 # LinkedHashMap
bash testMapMethodsComparison.sh 10 2 20000000 # TreeMap
bash testMapMethodsComparison.sh 10 3 20000000 # Hashtable
bash testMapMethodsComparison.sh 10 4 20000000 # ConcurrentHashMap
For random access, we use Map.get(int randomInt)
, where randomInt
is a random integer, as many times as there are elements in the list, that is 5 000 000 times in our case.
To remove half of the elements, we use the Map.entrySet.iterator()
and we remove one out two objects.
References#
Energy profiles of Java collections classes. Samir Hasan, Zachary King, Munawar Hafiz, Mohammed Sayagh, Bram Adams, and Abram Hindle - 38th International Conference on Software Engineering (2016). pdf
Java™ Platform, Standard Edition 8 API Specification, (2022). Package Util. [online] Available at: https://docs.oracle.com/javase/8/docs/api/java/util/package-summary.html
Java Set#
Choice of the set class#
In Java, the main implementations of the Set
, which is a Collection
whose elements are unique, are the HashSet
, the LinkedHashSet
and the TreeSet
. They are based on their Map
counterpart where the keys would be the elements of the set and the values would all be null
.
Main choices#
HashSet |
LinkedHashSet |
TreeSet |
|
---|---|---|---|
Data structure |
Hash table |
Hash table and linked list |
Red-Black tree |
Order of collections |
No (random order) |
Order of insertion (as a LinkedList) |
Natural order |
Permits null as key |
Yes |
Yes |
No, except if using a comparator that allow |
Thread-safe |
No |
No |
No |
Benchmark 1#
Click to expand for benchmark 1
Insertion of elements
HashSet |
LinkedHashSet |
TreeSet |
|
---|---|---|---|
Set.add(Object) |
\(O(1)\) |
\(O(1)\) |
\(O(\log(n))\) |
Insertion |
\(86.04 \pm 1.39\ J\) |
\(96.92 \pm 1.08\ J\) |
\(186.42 \pm 4.65\ J\) |
However, if we use a bad hashing function, the worst-case scenario for the hash tables is \(O(n)\) (as if we iterate over a List).
Remove elements
HashSet |
LinkedHashSet |
TreeSet |
|
---|---|---|---|
Set.remove( Object key ) |
\(O(1)\) |
\(O(1)\) |
\(O(\log(n))\) |
Remove half the elements |
\(15.47 \pm 0.75\ J\) |
\(15.97 \pm 0.16\ J\) |
\(5.73 \pm 0.81\ J\) |
Access
HashSet |
LinkedHashSet |
TreeSet |
|
---|---|---|---|
Set.contains( Object key ) |
\(O(1)\) |
\(O(1)\) |
\(O(\log(n))\) |
Check if objects belong to the Set |
\(12.87 \pm 7.74\ J\) |
\(26.12 \pm 1.25\ J\) |
\(106.58 \pm 3.89\ J\) |
Iterate with the Set.iterator() |
\(4.32 \pm 0.42\ J\) |
\(1.52 \pm 0.02\ J\) |
\(6.65 \pm 0.72\ J\) |
The order of the TreeSet
is the natural order by default, but other orders can be used by precising the Comparator
when creating the TreeSet
. Moreover, if this Comparator
allows null
, then null
can be used as a key.
Conclusion#
We have the exact same conclusion than with the Map
implementations:
For general purposes,
HashSet
To keep the order of insertion,
LinkedHashSet
To keep the elements sorted,
TreeSet
And you should consider using a LinkedHashSet
while inserting yourself the elements in the order you want instead of using a TreeSet
.
Test specifications#
Click to see
The methods used are in the collections_comparison.SetMethodsComparison.java class.
The test was launched using the collections_comparison.TestSetMethodsComparison.java class with JoularJX by using the bash script and the files located in the folder scripts/test/testSetMethodsComparison/.
We launched 5 times the test class to get these means and standard deviations. The results were obtained using 5 000 000 strings which were created following the pattern “Number is %d”, where %d is an other random integer. When we iterate, we only iterate one time over the whole list.
We used this script to quickly launch those tests:
Click to see
#!/bin/bash
# Launches testSetMethodsComparison.sh
bash testSetMethodsComparison.sh 5 0 5000000 # HashSet
bash testSetMethodsComparison.sh 5 1 5000000 # LinkedHashSet
bash testSetMethodsComparison.sh 5 2 5000000 # TreeSet
To check if some objects belonged to the Set
, we use Set.contains(String s)
, where s
was created following the pattern “Number is %d”, where %d is an other random integer as many times as there are elements in the list, that is 5 000 000 times in our case.
To remove half of the elements, we use the Set.iterator()
and we remove one out of two objects.
References#
Energy profiles of Java collections classes. Samir Hasan, Zachary King, Munawar Hafiz, Mohammed Sayagh, Bram Adams, and Abram Hindle - 38th International Conference on Software Engineering (2016). pdf
Java™ Platform, Standard Edition 8 API Specification, (2022). Package Util. [online] Available at: https://docs.oracle.com/javase/8/docs/api/java/util/package-summary.html
Looping over an array or an ArrayList#
It seems that depending on the device you use, especially for the devices without Just-In-Time (JIT) compiler, like the first versions of Android and some IoT devices, some practises allow the application to be a little faster and less consuming.
Depending on the class of the array, it seems to be better to store the length of an array when we iterate over it, because without a JIT, the Java Virtual Machine (JVM) will have to retrieve the value of the length at each iteration.
It should not affect a computer or any device with a JIT, but we check it here and check if other loop structures could be used or not.
For the following benchmarks:
The “!=” versions are the use of the !=
comparator instead of the <
comparator for the loop with array.length
.
The definitions of our loops:
Base iteration
Object[] array = createMyArray(); // or ArrayList<Object> al = createMyArrayList()
for (int j = 0; j < numberOfLoops; j++) {
for (int i = 0; i < array.length; i++) { // or i < al.size()
// Doing stuff
}
}
Recommended iteration
Object[] array = createMyArray();
int n = array.length
for (int j = 0; j < numberOfLoops; j++) {
for (int i = 0; i < n; i++) {
// Doing stuff
}
}
Descending order
Object[] array = createMyArray();
for (int j = 0; j < numberOfLoops; j++) {
for (int i = array.length - 1; i > -1; i--) {
// Doing stuff
}
}
Benchmark 1#
Click to expand for benchmark 1
The int[]
array were filled of 250 zeros and we looped over it 1 000 000 000 times.
For the ArrayList<Object>
and the Object[]
array, they were filled of 250 null and we looped over them 100 000 000 times.
We launched 10 times the test class to get these means and standard deviations. We have divided the result of the int[]
by 10 because there were 10 times more loops and it gives a better view to compare the different consumption between the data structure.
int[] |
Object[] |
ArrayList<Object> |
|
---|---|---|---|
Base iteration |
\(33.859 \pm 1.031\ J\) |
\(206.85 \pm 11.72\ J\) |
\(222.02 \pm 7.00\ J\) |
Base iteration “!=” |
\(319.562 \pm 10.887\ J\) |
\(363.97 \pm 40.28\ J\) |
\(542.67 \pm 48.36\ J\) |
Recommended iteration |
\(\mathbf{29.630 \pm 1.048\ J}\) |
\(491.82 \pm 15.55\ J\) |
\(220.49 \pm 10.57\ J\) |
Recommended iteration “!=” |
\(319.562 \pm 10.887\ J\) |
\(368.17 \pm 20.98\ J\) |
\(500.41 \pm 18.42\ J\) |
Descending order iteration |
\(39.016 \pm 1.440\ J\) |
\(\mathbf{181.72 \pm 14.75\ J}\) |
\(\mathbf{178.07 \pm 10.33\ J}\) |
Benchmark 2#
Click to expand for benchmark 2
The int[]
array were filled of 250 000 zeros and we looped over it 100 000 times.
For the ArrayList<Object>
and the Object[]
array, they were filled of 250 000 null and we looped over them 100 000 times.
We launched 10 times the test class to get these means and standard deviations.
int[] |
Object[] |
ArrayList<Object> |
|
---|---|---|---|
Base iteration |
\(\mathbf{61.81 \pm 4.06\ J}\) |
\(\mathbf{208.89 \pm 6.32\ J}\) |
\(231.58 \pm 7.88\ J\) |
Base iteration “!=” |
\(337.73 \pm 12.71\ J\) |
\(354.30 \pm 9.90\ J\) |
\(523.77 \pm 55.55\ J\) |
Recommended iteration |
\(\mathbf{62.86 \pm 2.96\ J}\) |
\(341.34 \pm 10.12\ J\) |
\(272.26 \pm 11.12\ J\) |
Recommended iteration “!=” |
\(266.58 \pm 12.22\ J\) |
\(408.28 \pm 18.84\ J\) |
\(555.32 \pm 55.55\ J\) |
Descending order iteration |
\(\mathbf{60.92 \pm 4.31\ J}\) |
\(233.45 \pm 9.37\ J\) |
\(\mathbf{194.29 \pm 11.55\ J}\) |
Conclusion#
The
!=
comparator should be avoided if the>
or the<
comparator can be used instead.The so-called “recommended structure” is not a good name.
Surprisingly, the better structure for an
ArrayList
seems to be in a descending order.
Test specifications#
Click to see
The methods used are in the collections_comparison.IterationComparison.java class.
The test was launched using the collections_comparison.TestIterationComparison.java class with JoularJX by using the bash script and the files located in the folder scripts/test/testIterationComparison/.
We used this script to quickly launch those tests:
Click to see
#!/bin/bash
# Launches testIterationComparison.sh with the 3 possible arrays with size 250 and number of loops = 100 000 000 or 1 000 000 000
bash testIterationComparison.sh 10 0 250 100000000 # ArrayList<Object>
bash testIterationComparison.sh 10 1 250 100000000 # Object[]
bash testIterationComparison.sh 10 2 250 1000000000 # int[]
# Launches testIterationComparison.sh with the 3 possible arrays with size 250 000 and number of loops = 100 000
bash testIterationComparison.sh 10 0 250000 100000 # ArrayList<Object>
bash testIterationComparison.sh 10 1 250000 100000 # Object[]
bash testIterationComparison.sh 10 2 250000 100000 # int[]
References#
Performance Tips | Android Developpers. “Use Enhanced For Loop Syntax” part.[online] Available at: https://android-doc.github.io/training/articles/perf-tips.html#Loops
An investigation into energy-saving programming practices for android smartphone app development, D. Li, W. G. J. Halfond, in: Proceedings of the 3rd International Workshop on Green and Sustainable Software, GREENS 2014, ACM, New York, NY, USA, 2014, pp. 46–53. doi:10.1145/2593743.2593750. URL http://doi.acm.org/10.1145/2593743.2593750