php – MySQL: Alternative to JSON for older versions / Storing & querying multiple values in one column

Exception or error:

I’m writing a PHP package in which I need to store a set of “documents”, each with its own attributes, which can vary in quantity, name and type, just as the attributes for different types of products could differ (e.g. a shoe may have a material, color and style, but a smartphone may have an operating system, weight, size, etc.).

| id  | name       |
|-----|------------|
| 1   | Acme Shoe  |
| 2   | Acme Phone |

I want to be able to query all of my documents (or products) by their attributes. The queries could range from a very simple WHERE attribute_a = value_a to a much more complicated nested set of clauses, like WHERE ((attribute_a = value_a OR attribute_a > value_b) AND attribute_b LIKE '%pattern%').

My ideal scenario would be to use the native JSON support afforded by MySQL 5.7+ and MariaDB 10.2+ to store the attributes against each document and use the handy JSON_EXTRACT function to extract any attribute that I want to query.

| id  | name       | attributes                             |
|-----|------------|----------------------------------------|
| 1   | Acme Shoe  | {"material":"canvas","color":"black"}  |
| 2   | Acme Phone | {"os":"android","weight":100}          |


SELECT *
FROM documents
WHERE (
    JSON_EXTRACT(attributes, "$.weight") = 1
    OR JSON_EXTRACT(attributes, "$.weight") > 99
)
AND JSON_EXTRACT(attributes, "$.os") LIKE '%droid%'

Unfortunately, my package needs to be able to support older versions of MySQL and MariaDB. I had considered storing JSON in a TEXT or LONGTEXT field and using REGEXP to pick out the values of the attributes I need when making comparisons, but I imagine that would be incredibly resource intensive and slow. Please correct me if I’m wrong.

So as it stands, I feel like I’m locked into going for an EAV type solution:

| id  | name       |
|-----|------------|
| 1   | Acme Shoe  |
| 2   | Acme Phone |


| id  | document_id | key      | value   |
|-----|-------------|----------|---------|
| 1   | 1           | material | canvas  |
| 2   | 1           | color    | black   |
| 3   | 2           | os       | android |
| 4   | 2           | weight   | 100     |

Finding the documents with one WHERE clause is relatively trivial:

SELECT DISTINCT document_id
FROM document_attributes
WHERE `key` = 'material'
AND `value` = 'canvas'

However, I have no idea how I would implement more complicated WHERE clauses. The particular problem is that the attributes are stored in separate rows, e.g.:

  • Getting the documents that have canvas material AND are colored black.
  • Getting the documents that have android os AND have a weight of either 1 or greater than 99.

Any advice or recommendations would be greatly appreciated.


Edit

After some consideration of the EAV approach, the best I have managed to come up with so far is repeatedly joining the attributes table to the documents table, once for each attribute involved in the query. From there, I’m able to use each attribute’s value in the WHERE clause. For example, selecting all products where the attribute “material” is “canvas” OR the “weight” is greater than 99:

SELECT d.id AS id, a1.value AS material, a2.value AS weight
FROM documents AS d
LEFT JOIN document_attributes AS a1 ON a1.document_id = d.id AND a1.`key` = 'material'
LEFT JOIN document_attributes AS a2 ON a2.document_id = d.id AND a2.`key` = 'weight'
WHERE a1.value = 'canvas'
OR a2.value > 99

This appears to yield:

| id | material | weight |
|----|----------|--------|
| 1  | canvas   | NULL   |
| 2  | NULL     | 100    |
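
The join-per-attribute idea can be tried end to end with a small script. This is only a sketch under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, the document_attributes layout with `key` and `value` columns from earlier in the question, and hypothetical alias names (a_material, a_os, a_weight). The `+ 0` forces a numeric comparison on the text column, which SQLite needs and which MySQL also accepts.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE document_attributes (
    id INTEGER PRIMARY KEY, document_id INTEGER, `key` TEXT, `value` TEXT
);
INSERT INTO documents VALUES (1, 'Acme Shoe'), (2, 'Acme Phone');
INSERT INTO document_attributes VALUES
    (1, 1, 'material', 'canvas'), (2, 1, 'color', 'black'),
    (3, 2, 'os', 'android'), (4, 2, 'weight', '100');
""")

# One LEFT JOIN per attribute that the filter refers to.
JOINS = """
SELECT d.id
FROM documents AS d
LEFT JOIN document_attributes AS a_material
       ON a_material.document_id = d.id AND a_material.`key` = 'material'
LEFT JOIN document_attributes AS a_os
       ON a_os.document_id = d.id AND a_os.`key` = 'os'
LEFT JOIN document_attributes AS a_weight
       ON a_weight.document_id = d.id AND a_weight.`key` = 'weight'
"""

def ids(where_clause):
    # where_clause is hard-coded, trusted SQL in this sketch;
    # real code should bind user-supplied values as parameters.
    query = JOINS + "WHERE " + where_clause + " ORDER BY d.id"
    return [row[0] for row in conn.execute(query)]

# material is canvas OR weight is greater than 99
canvas_or_heavy = ids("a_material.`value` = 'canvas' OR a_weight.`value` + 0 > 99")

# os is android AND (weight is 1 OR weight is greater than 99)
android_and_weight = ids(
    "a_os.`value` = 'android' "
    "AND (a_weight.`value` + 0 = 1 OR a_weight.`value` + 0 > 99)"
)
print(canvas_or_heavy, android_and_weight)
```

With the sample rows, the first filter should return documents 1 and 2 and the second only document 2. A missing attribute shows up as NULL after the LEFT JOIN, so any comparison against it is simply not true, which is what lets arbitrary nesting of AND/OR work in one WHERE clause.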

How to solve:

Assuming the document_id/key/value combination is unique, you could do something like this:

SELECT document_id FROM example
WHERE (`key`='material' AND `value`='canvas')
OR    (`key`='color' AND `value`='black')
GROUP BY document_id
HAVING COUNT(*) = 2;

SELECT document_id FROM example
WHERE (`key`='os' AND `value`='android')
OR    (`key`='weight' AND (`value` = 1 OR `value` > 99))
GROUP BY document_id
HAVING COUNT(*) = 2;
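
Because AND binds tighter than OR, where the parentheses sit in the weight branch changes which rows count towards the HAVING total. The following is a rough, runnable sketch of that, assuming SQLite (via Python's sqlite3) as a stand-in for MySQL and adding a hypothetical third document that is android but has no weight attribute. The `+ 0` is there because SQLite would otherwise compare the text column as a string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Document 3 is a made-up extra row, added only to expose the difference.
conn.executescript("""
CREATE TABLE example (document_id INTEGER, `key` TEXT, `value` TEXT);
INSERT INTO example VALUES
    (1, 'material', 'canvas'), (1, 'color', 'black'),
    (2, 'os', 'android'), (2, 'weight', '100'),
    (3, 'os', 'android'), (3, 'storage', '128');
""")

def matching_ids(weight_branch):
    # Count how many of the two conditions each document satisfies,
    # and keep the documents that satisfy both.
    sql = (
        "SELECT document_id FROM example "
        "WHERE (`key` = 'os' AND `value` = 'android') OR " + weight_branch +
        " GROUP BY document_id HAVING COUNT(*) = 2 ORDER BY document_id"
    )
    return [row[0] for row in conn.execute(sql)]

# `value` > 99 sits outside the key check, so it matches ANY attribute.
loose = matching_ids(
    "(`key` = 'weight' AND (`value` + 0 = 1) OR (`value` + 0 > 99))"
)

# Both value tests are grouped under the key check.
strict = matching_ids(
    "(`key` = 'weight' AND (`value` + 0 = 1 OR `value` + 0 > 99))"
)
print(loose, strict)
```

With these rows the loose grouping also picks up document 3 (its storage value of 128 passes the bare "> 99" test), while the strict grouping returns only document 2.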

Answer:

Try this SQL:

select SUBSTRING_INDEX( SUBSTRING_INDEX(attributes,'"',4) ,'"',-1) from documents;
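
To see what that nested SUBSTRING_INDEX call picks out, here is a small Python emulation. The substring_index helper is a hypothetical stand-in written for this illustration, based on my understanding of how MySQL's function counts delimiters; it is not MySQL itself.

```python
def substring_index(text, delim, count):
    """Rough stand-in for MySQL's SUBSTRING_INDEX (assumed semantics)."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right, counting from the end.
    return delim.join(parts[count:])

def first_quoted_value(attributes):
    # Same shape as the SQL: cut before the 4th quote, then keep what
    # follows the last remaining quote.
    return substring_index(substring_index(attributes, '"', 4), '"', -1)

shoe = first_quoted_value('{"material":"canvas","color":"black"}')
phone = first_quoted_value('{"os":"android","weight":100}')
# Same phone attributes, but with the numeric one first.
reordered = first_quoted_value('{"weight":100,"os":"android"}')
print(shoe, phone, reordered)
```

For the two sample rows this gives canvas and android, i.e. the value of whichever attribute happens to come first. It does not look an attribute up by name, and when the first value is unquoted (as in the reordered example) it ends up returning a key, os, so it is probably only usable if the JSON layout is tightly controlled.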
