How to design an efficient database for article categories and tags?

Almost every web project involves designing article categories and tags. This is a common and fairly typical case.

The webmaster does not claim that these ideas are the best; they are shared here for discussion and mutual improvement.

The project we assume is a blog system, whose core functionality revolves around articles. So today we discuss how to design the article categories and tags of a blog system.

1. First of all, categories and tags are attached to specific articles. Of course, some articles may have neither a category nor any tags, which is something people tend to overlook when writing queries.

Our first instinct when querying the article list is to bring in the category table and fetch every article together with its category. The link is generally the category id stored in the article table matching the id of the category table. Expressing that link in a WHERE clause causes a problem: only articles that have a category are returned, and articles without one are left out.

What should we do then? Use a join query, specifically LEFT JOIN, which keeps every article and shows NULL in the category id column for articles without a category. Usually we only use LEFT JOIN and rarely use RIGHT JOIN.
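The difference between the two query styles can be seen in a small sketch. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration; the article itself does not prescribe a database engine or language, and the sample rows are made up. The table and column names (post, category, cid) follow the schema given later in this article.

```python
import sqlite3

# Hypothetical sample data; an in-memory SQLite database stands in for the blog's real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, cid INTEGER);
    INSERT INTO category (id, name) VALUES (1, 'Databases');
    INSERT INTO post (id, title, cid) VALUES
        (1, 'Designing tags', 1),
        (2, 'Uncategorized note', NULL);
""")

# Linking the tables in a WHERE clause drops the article that has no category.
where_rows = conn.execute("""
    SELECT post.id, post.title, category.name
    FROM post, category
    WHERE post.cid = category.id
    ORDER BY post.id
""").fetchall()

# LEFT JOIN keeps every article; the category column is NULL (None in Python) when missing.
left_join_rows = conn.execute("""
    SELECT post.id, post.title, category.name
    FROM post
    LEFT JOIN category ON post.cid = category.id
    ORDER BY post.id
""").fetchall()

print(where_rows)
print(left_join_rows)
```

Here where_rows contains only the categorized article, while left_join_rows contains both articles, the second with None as its category name.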

2. Generally, it is best for an article to belong to only one category. Of course, you can let an article belong to several categories if you want, but the webmaster does not recommend it. Repeating articles across multiple categories looks unprofessional. Even though some articles may indeed touch on several topics, you should classify them by their main focus.

Tags are different: an article may have multiple tags, which means that one simple SQL statement cannot fetch the category and tags of all articles while also guaranteeing that each article id appears only once in the result.

Usually we loop over the query result directly, so the result is generally a two-dimensional array in which each second-level element holds the information of exactly one article.

However, from the point of view of a single article the relationship with tags is many-to-one: multiple tags correspond to one article. If you only use one SQL statement that joins in the tags, the result will of course contain several rows per article, which does not match the target data shape.

Therefore, after querying the articles and their categories, we query the article tags based on that first result, then combine the two results and store them in one array. This is the query method for the article list page.
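A minimal sketch of this two-step approach follows, again using Python's sqlite3 module as an assumed stand-in for the real stack. The function name list_posts, the sample rows, and the relation table name post_tag are hypothetical and chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, cid INTEGER);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    -- post_tag is a hypothetical name for the article-to-tag relation table.
    CREATE TABLE post_tag (id INTEGER PRIMARY KEY, postid INTEGER, tagid INTEGER);
    INSERT INTO category (id, name) VALUES (1, 'Databases');
    INSERT INTO post (id, title, cid) VALUES
        (1, 'Designing tags', 1), (2, 'Untagged note', NULL);
    INSERT INTO tag (id, name) VALUES (1, 'php'), (2, 'mysql');
    INSERT INTO post_tag (postid, tagid) VALUES (1, 1), (1, 2);
""")


def list_posts(conn):
    """Return one dict per article, with its category and a list of tag names."""
    # Query 1: one row per article, with its category (NULL if none).
    posts = {}
    for row in conn.execute("""
        SELECT post.id AS id, post.title AS title, category.name AS category
        FROM post
        LEFT JOIN category ON post.cid = category.id
        ORDER BY post.id
    """):
        posts[row["id"]] = {
            "id": row["id"],
            "title": row["title"],
            "category": row["category"],
            "tags": [],
        }

    # Query 2: all article-tag pairs, merged into the first result by article id.
    for row in conn.execute("""
        SELECT post_tag.postid AS postid, tag.name AS name
        FROM post_tag
        JOIN tag ON post_tag.tagid = tag.id
        ORDER BY post_tag.postid, tag.id
    """):
        if row["postid"] in posts:
            posts[row["postid"]]["tags"].append(row["name"])

    return list(posts.values())


articles = list_posts(conn)
print(articles)
```

Each article id appears exactly once in articles, and an article without tags simply carries an empty list. On a paginated list page, the second query would normally be restricted to the ids returned by the first one.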

For a single article page, we can simply run two queries: one for the article and its category, and one for its tags.
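For the single-article case, the two queries might look like the sketch below. As before, Python's sqlite3 module is an assumption made for illustration, and get_post, post_tag, and the sample rows are hypothetical names and data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, cid INTEGER);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    -- post_tag is a hypothetical name for the article-to-tag relation table.
    CREATE TABLE post_tag (id INTEGER PRIMARY KEY, postid INTEGER, tagid INTEGER);
    INSERT INTO category (id, name) VALUES (1, 'Databases');
    INSERT INTO post (id, title, cid) VALUES (1, 'Designing tags', 1);
    INSERT INTO tag (id, name) VALUES (1, 'php'), (2, 'mysql');
    INSERT INTO post_tag (postid, tagid) VALUES (1, 1), (1, 2);
""")


def get_post(conn, post_id):
    """Load one article with its category and tags, or None if it does not exist."""
    # Query 1: the article and its category (None when uncategorized).
    row = conn.execute("""
        SELECT post.id, post.title, category.name
        FROM post
        LEFT JOIN category ON post.cid = category.id
        WHERE post.id = ?
    """, (post_id,)).fetchone()
    if row is None:
        return None

    # Query 2: the tag names of that one article.
    tags = [name for (name,) in conn.execute("""
        SELECT tag.name
        FROM post_tag
        JOIN tag ON post_tag.tagid = tag.id
        WHERE post_tag.postid = ?
        ORDER BY tag.id
    """, (post_id,))]

    return {"id": row[0], "title": row[1], "category": row[2], "tags": tags}


article = get_post(conn, 1)
missing = get_post(conn, 99)
print(article, missing)
```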

Before giving the specific database design, we first talked about how to query the results, and I believe everyone could follow it. Here is the design itself:

1. Article table, post, with the following fields: id (unique identifier), aid (author id), title (title), content (content), cid (category id)
2. Category table, category, with the following fields: id (unique identifier, referenced by cid of the post table), name (category name)
3. Tag table, tag, with the following fields: id (unique identifier), name (tag name)
4. Relation table, tag_, with the following fields: id (unique identifier), postid (article id, references id of the post table), tagid (tag id, references id of the tag table)

Some friends may ask: why use a separate table to store the relationship between articles and tags? Why not add an article id field directly to the tag table, like this: tag table: id, postid, name

It is not impossible to do this, but because an article corresponds to multiple tags, the values of the name field will be repeated. For example, if an article with id 1 has two tags, php and mysql, they are stored in the tag table like this:

id: 1, postid: 1, name: php
id: 2, postid: 1, name: mysql

Another article, say with id 2, also has the two tags php and mysql, which are stored in the tag table like this:

id: 3, postid: 2, name: php
id: 4, postid: 2, name: mysql

The problem is easy to see: in this design the name field, that is, the tag name, may be duplicated many times in the same table.
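For comparison with the single-table variant, the four-table design listed above could be sketched as follows, loaded with the same two example articles. This is only an illustration in Python's sqlite3 module, not the article's own DDL: the column types, the NOT NULL, UNIQUE and REFERENCES constraints, the article titles, and the relation table name post_tag are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE post (
        id      INTEGER PRIMARY KEY,
        aid     INTEGER,                          -- author id
        title   TEXT NOT NULL,
        content TEXT,
        cid     INTEGER REFERENCES category(id)   -- NULL when uncategorized
    );
    CREATE TABLE tag (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE                 -- assumption: each name stored once
    );
    -- Hypothetical name: the article only gives this table's name as "tag_".
    CREATE TABLE post_tag (
        id     INTEGER PRIMARY KEY,
        postid INTEGER NOT NULL REFERENCES post(id),
        tagid  INTEGER NOT NULL REFERENCES tag(id),
        UNIQUE (postid, tagid)                    -- assumption: no duplicate pairs
    );

    -- The two example articles, each tagged php and mysql.
    INSERT INTO post (id, title) VALUES (1, 'First'), (2, 'Second');
    INSERT INTO tag (id, name) VALUES (1, 'php'), (2, 'mysql');
    INSERT INTO post_tag (postid, tagid) VALUES (1, 1), (1, 2), (2, 1), (2, 2);
""")

tag_rows = conn.execute("SELECT id, name FROM tag ORDER BY id").fetchall()
link_rows = conn.execute(
    "SELECT postid, tagid FROM post_tag ORDER BY postid, tagid"
).fetchall()
print(tag_rows)
print(link_rows)
```

With this layout each tag name is stored once in tag, and the repetition moves into the integer tagid column of the relation table.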

However, the advantage of this design is that finding out how many articles are under a tag only requires querying this one table. For example, to find out how many articles carry the php tag, you only need select count(name) from tag where name = 'php'.

The disadvantage is that querying the set of all tags requires a group by name clause to remove the duplicate rows.

With the first design (the separate relation table), you only need select * from tag.
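The trade-off can be made concrete by running both kinds of query against both designs. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module, names the relation table post_tag, and renames the single-table variant to flat_tag only so that both designs can live in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design A: separate relation table (post_tag is a hypothetical name).
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tag (id INTEGER PRIMARY KEY, postid INTEGER, tagid INTEGER);
    INSERT INTO tag (id, name) VALUES (1, 'php'), (2, 'mysql');
    INSERT INTO post_tag (postid, tagid) VALUES (1, 1), (1, 2), (2, 1), (2, 2);

    -- Design B: article id stored directly in the tag table
    -- (called flat_tag here only to avoid a name clash with design A).
    CREATE TABLE flat_tag (id INTEGER PRIMARY KEY, postid INTEGER, name TEXT);
    INSERT INTO flat_tag (id, postid, name) VALUES
        (1, 1, 'php'), (2, 1, 'mysql'), (3, 2, 'php'), (4, 2, 'mysql');
""")

# How many articles carry the php tag?
# Design A needs a join; design B reads a single table.
count_a = conn.execute("""
    SELECT COUNT(*)
    FROM post_tag
    JOIN tag ON post_tag.tagid = tag.id
    WHERE tag.name = ?
""", ("php",)).fetchone()[0]
count_b = conn.execute(
    "SELECT COUNT(name) FROM flat_tag WHERE name = ?", ("php",)
).fetchone()[0]

# What is the set of all tags?
# Design A reads the tag table directly; design B must collapse duplicates.
all_tags_a = [name for (name,) in conn.execute(
    "SELECT name FROM tag ORDER BY name"
)]
all_tags_b = [name for (name,) in conn.execute(
    "SELECT name FROM flat_tag GROUP BY name ORDER BY name"
)]

print(count_a, count_b)
print(all_tags_a, all_tags_b)
```

Both designs give the same answers; what differs is which of the two queries stays a single-table lookup.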

For the moment, it seems that the choice is not easy.

Both designs involve redundant data: in the first design, the tagid field of the tag_ table is duplicated as well. And both designs have their own advantages.

So how should we choose? The webmaster cannot say for certain, so no conclusion is drawn here for everyone.

However, when studying the data structure of WordPress, the webmaster found that WordPress uses a separate table to store the relationship between articles and tags.

In addition, the right design sometimes depends on the requirements of specific features, so this question is left open for everyone to discuss.

Tags: categories and tags, blog database design