INF 4401 – Introduction To Databases

Portail informatique

CI6 : NoSQL

Understand NoSQL databases.

ACID (∼30min – medium)

Explain what can happen if the following transaction is not atomic:
  1. Check if a flight is available.
  2. If it is available, book a seat in the plane.
  3. Remove the money of the client from the bank account
If the transaction is not atomic, it means it can stop anywhere without anything special happening. Here, if the transaction stops after the second step, it means the seat is booked but the client did not pay!

Explain why the following transaction can create consistancy problems in a database where all students' grades must be above 0.
  1. Check if a student cheated.
  2. If it is the case, remove 5 points on the final grade.
Here, if the final grade is less that 5, then the grade can become negative, which is impossible!

Explain why isolation is important with the following two transactions. You can use a drawing to explain the problem.
First transaction:
  1. PersonA checks if a Harry Potter is available at the library.
  2. If it is available, change the status of the book to non-available
Second transaction:
  1. PersonB checks if a Harry Potter is available at the library.
  2. If it is available, change the status of the book to non-available
If the two transactions happen at the same time, then PersonA and PersonB might both think the book is available and both reserve it.

Solution

Explain what would be the problem if the database of the financial system is not durable and crashes after the following transaction:
  1. PersonA asks to buy a steam game activation code to PersonB
  2. PersonB agrees
  3. PersonA sends money to PersonB
  4. PersonB sends the game code to PersonA
If the financial database crashes, then the money PersonB receive might disappear!

NoSQL - Key Value (∼20min – medium)

In this exercice, we want to use a key-value NoSQL database.

You are working for the social security and they want to build a quick system to get the phone number of everybody. You decide to use a key-value database. What would be the key? The value?
The key can be the social security number and the value the phone number.
Next, we want to adapt our wine database to a key-value database. To do so, we create several collections. Here is a sample of the collections giving the producer of each harvest and the harvest of origin of a wine.

Solution

How can you get the producer of a wine?
To get the producer of a wine, you need to first find the harvest of this wine and then the producer of this harvest.
With this example, you can see that it can become very complex to gather information coming from many collections.

Can you get the wines produced by a given producer?
No because the producer is not a key of a table. Besides, if we create a table linking producers to harvests, we would have a problem because the harvest is not unique.

We want to integrate the information about the vineyard of each wine. What would be the key? The value?
The key is the WineID and the value the vineyard.

NoSQL - Document (∼20min – medium)

In this exercice, we focus on Document NoSQL databases. Here, we adapt our example of the wine database. We want to represent the wines with documents of the following structure in JSON:
wine = { "wineID": "W12", "vineyard": "Chinon", "year": 2015, "degree": 13.5, "harvests": [ { "harvestID": "H87", "producer": { "producerID": "P15", "name": "Jean Lecoude", "city": "Avoine" }, "weight": 12.5 }, { "producer": { "producerID": "P42", "name": "John Louis", "city": "Chinon" }, "weight": 12.5 } ], "orders": [ { "orderID": "O789", "date": "27-12-2022", "client": { "clientID": "C46", "name": "Brutus", "city": "Lyon" }, "quantity": 15 }, { "orderID": "O1245", "date": "30-01-2023", "client": { "clientID": "C869", "name": "Cesar", "city": "Rome" }, "quantity": 30 }, { "orderID": "O2589", "date": "12-02-2023", "client": { "clientID": "C46", "name": "Brutus", "city": "Lyon" }, "quantity": 2 } ] }

What is the value of wine["harvests"][0]["producer"]["city"]?
Avoine

How can you access the value "Cesar" in the above document?
wine["orders"][1]["client"]["name"]

If all the wines are stored in documents like the one presented above, what would be the problem is a client change his or her city?
If a client changes his or her city, then we would need to read all the wines, check if the client bought this wine, and if it is the case, change the city. This can be long and create many mistakes. The problem is that some information is redundant, and therefore hard to update.

Following the example above, encode the following schema into one document for each student:
  • Student(studentID: String, name: String, birthdate: Date)
  • advisor(studentID: String, teacherID: String)
  • Teacher(teacherID: String, name: String, salary: Float)
  • attend(studentID: String, sessionID: String, grade: Integer)
  • Session(sessionID: String, semester: Integer, year: Integer)
For the values, you can use what you want. If you have a list, you need to create more than one entry.
{ "studentID": "S486", "name": "Albert Einstein", "birthdate": "14/03/1879", "advisor" : { "teacherID": "T4855", "name": "Max Planck", "salary": 4258 }, "attend": [ { "session": { "sessionID": "Se45458", "semester": 2, "year": 2022 }, "grade": 18 }, { "session": { "sessionID": "Se945665", "semester": 1, "year": 2023 }, "grade": 14 } ] }

NoSQL - Graph (∼20min – medium)

In this exercice, we will focus on graph NoSQL Databases.

We decide to represent our wine database with a graph. Here is a sample of this database:

Solution

What are the nodes in this graph? What do they represent?
The graph contains 8 nodes that represent wines, producers, and clients. The wines are W1, W2, W3, the producers are P1, P2, the clients are C1, C2, C3.

How many edges does this graph contain? What do they represent?
There are 8 edges. An edge between a producer and a wine represents that the producer harvested for this wine. An edge between a client and a wine means the client bought this wine.

Construct a graph that could be used to represent (partially) the following schema. You graph need to contain at least 2 teachers, 2 students, and 3 sessions.
  • Student(studentID: String, name: String, birthdate: Date)
  • advisor(studentID: String, teacherID: String)
  • Teacher(teacherID: String, name: String, salary: Float)
  • attend(studentID: String, sessionID: String, grade: Integer)
  • Session(sessionID: String, semester: Integer, year: Integer)
Explain what the nodes and edges represent.
Solution

In this graph, we have nodes for students, teachers, and sessions. An edge between a teacher and a student means the teacher supervizes the student. An edge between a student and a session means the student attends the session.