In-depth understanding of how HTTPS works
In recent years the Internet has changed dramatically. In particular, the HTTP protocol that we have long taken for granted is gradually being replaced by HTTPS. Driven by browsers, search engines, certificate authorities (CAs) and large Internet companies, the Internet has entered the "HTTPS encryption era", and HTTPS is expected to replace HTTP as the mainstream transfer protocol within the next few years.

After reading this article, I hope you can understand:

What are the problems in HTTP communication?

How does HTTPS address the problems in HTTP?

How does HTTPS work?

I. What is HTTPS?

HTTPS is the secure version of the HTTP protocol: it adds an SSL/TLS encryption layer beneath HTTP and encrypts the transmitted data. It is now widely used for security-sensitive communication on the World Wide Web, such as payment transactions.

The main functions of HTTPS are:

(1) Encrypt data and establish a secure channel, so that data stays confidential in transit;

(2) Authenticate the website's server.

HTTPS is commonly used on Web login pages and checkout pages. With HTTPS, the URL starts with https:// instead of http://. In addition, when the browser visits a Web site on which HTTPS is working correctly, a lock icon appears in the address bar. How HTTPS is indicated varies from browser to browser.


II. Why do you need HTTPS?

Plain HTTP is exposed to security problems such as information theft and identity spoofing. Communicating over HTTPS effectively prevents these problems. First, let's look at the problems that exist in the HTTP protocol:

The communication uses plaintext (not encrypted), and the content may be eavesdropped

Because HTTP itself has no encryption capability, it cannot encrypt the communication as a whole (the content of the requests and responses exchanged over HTTP). That is, HTTP messages are sent in plaintext, meaning unencrypted.

This plaintext design is a major cause of security problems such as data leakage, data tampering, traffic hijacking and phishing attacks. Because HTTP cannot encrypt data, all communication data travels across the network exposed, in plaintext. Anyone with a network sniffer and some basic technical means can reconstruct the content of an HTTP message.
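To make the eavesdropping risk concrete, here is a minimal Python sketch of what a sniffer sees. The host example.com, the /login path and the form fields are hypothetical, and nothing is actually sent over the network: the point is only that the bytes of an HTTP request can be read back without any key.

```python
# Build the bytes of a hypothetical HTTP login request (nothing is sent).
body = "username=alice&password=s3cret"
request = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
).encode("ascii")


def sniff(captured: bytes) -> dict:
    """Play the role of a network sniffer: read form fields out of captured bytes."""
    # The body starts after the first blank line that ends the headers.
    _, _, raw_body = captured.partition(b"\r\n\r\n")
    pairs = raw_body.decode("ascii").split("&")
    return dict(pair.split("=", 1) for pair in pairs)


stolen = sniff(request)
print(stolen["password"])
```

No decryption step is needed anywhere in `sniff`, which is exactly the weakness described above.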

The integrity of the message cannot be proved, so it may be tampered with

Integrity here means that the information is accurate and unaltered. If integrity cannot be proved, there is no way to judge whether the information is accurate. Because HTTP cannot prove the integrity of a message, there is no way to detect it if the content of a request or response is tampered with after it is sent and before the other party receives it. In other words, there is no way to confirm that the request or response that was sent is the same as the one that was received.

The identity of the communicating party is not verified, so it may be impersonated

HTTP requests and responses do not confirm who the communicating party is. Because there is no step that verifies the other party, anyone can initiate a request. In addition, the server returns a response to any request it receives, no matter who sent it (provided the Web server does not restrict the sender's IP address or port number).

Since HTTP cannot verify the identity of the communicating party, anyone can set up a fake server to deceive users, which makes possible "phishing" that the user cannot detect.

By contrast, HTTPS has the following advantages over HTTP (described in detail below):

Data privacy: the content is symmetrically encrypted, and each connection generates a unique encryption key

Data integrity: the transmitted content is integrity-checked

Identity authentication: a third party cannot forge the identity of the server (or client)

III. How does HTTPS solve the above problems of HTTP?

HTTPS is not a new application-layer protocol. It is HTTP with its communication interface replaced by the SSL and TLS protocols.

Normally HTTP communicates directly with TCP. When SSL is used, HTTP communicates with SSL first, and SSL in turn communicates with TCP. In short, HTTPS is simply HTTP wrapped in a shell of SSL.


With SSL in place, HTTP gains the encryption, certificate and integrity-protection features of HTTPS. That is to say, HTTP plus encryption, authentication and integrity protection is HTTPS.


The main functions of HTTPS depend almost entirely on TLS/SSL, and TLS/SSL in turn relies on three basic kinds of algorithm: hash functions, symmetric encryption and asymmetric encryption. Asymmetric encryption provides identity authentication and key negotiation, symmetric encryption encrypts the data with the negotiated key, and hash functions are used to verify the integrity of the information.


Solving the problem that the content may be eavesdropped: encryption

Method 1. Symmetric encryption

This method uses the same key for encryption and decryption. Without the key, the ciphertext cannot be decrypted; conversely, anyone who has the key can decrypt it.

With symmetric encryption, the key must also be delivered to the other party. But how can it be handed over safely? If the key is sent over the Internet and the communication is intercepted, the key falls into the attacker's hands and the encryption becomes pointless. In addition, the receiver must store the received key securely.
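The idea that one shared key both encrypts and decrypts can be sketched in a few lines of Python. This is a toy cipher written only for illustration (a SHA-256 based keystream XORed with the data, with no nonce and no authentication); it is an assumption of this sketch, not what TLS uses, since real connections rely on vetted ciphers such as AES-GCM or ChaCha20-Poly1305.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (SHA-256 over key + counter)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(16)  # both sides must hold this same key
ciphertext = xor_cipher(shared_key, b"card number: 1234")
plaintext = xor_cipher(shared_key, ciphertext)

# An attacker without the shared key only gets garbage back.
wrong_key_result = xor_cipher(secrets.token_bytes(16), ciphertext)
```

The sketch also shows the weak point discussed above: `shared_key` has to reach the other side somehow before any of this works.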

Method 2. Asymmetric encryption

Public key encryption uses a pair of asymmetric keys: a private key and a public key. As the names imply, the private key must not be known to anyone else, while the public key can be published freely and obtained by anyone.

With public key encryption, the sender encrypts with the other party's public key, and the other party decrypts the received ciphertext with its own private key. In this way, the private key needed for decryption never has to be sent, so there is no need to worry about it being eavesdropped or stolen by attackers.


Asymmetric encryption supports one-to-many communication: the server only needs to maintain one private key to communicate in encrypted form with many clients.

This method has the following disadvantages:

The public key is public, so anything the server encrypts with its private key can be decrypted by an attacker who intercepts it and holds the public key, exposing its contents;

The public key carries no information about the server. Asymmetric encryption alone cannot establish that the server's identity is legitimate, so there is a risk of man-in-the-middle attacks: the public key that the server sends to the client may be intercepted and replaced by a man in the middle during transmission;

Encrypting and decrypting data with asymmetric encryption is comparatively slow, which reduces the efficiency of data transmission.
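The public key / private key relationship can be illustrated with textbook RSA. The sketch below uses deliberately tiny primes and no padding, so it is readable but completely insecure; the numbers and the helper names `encrypt` and `decrypt` are chosen for this example only.

```python
# Textbook RSA with tiny primes: readable, but NOT secure.
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)        # can be published freely
private_key = (d, n)       # must stay with the key owner


def encrypt(message: int, key: tuple) -> int:
    """Anyone holding the public key can encrypt."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Only the holder of the private key can decrypt."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65               # must be smaller than n in this toy scheme
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
```

Note that only `public_key` ever needs to travel over the network, which is the property the symmetric scheme lacked.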

Method 3. Symmetric encryption + asymmetric encryption (HTTPS adopts this method)

The advantage of a symmetric key is that encryption and decryption are fast. The advantage of an asymmetric key pair is that intercepted content cannot be read: even if you capture the data, you cannot decrypt it without the corresponding private key. It is like grabbing a safe without having the key to open it. So we combine the two and make full use of their respective strengths: asymmetric encryption is used for the key exchange, and symmetric encryption is used afterwards, once the connection is established and messages are being exchanged.

Specifically, the sender encrypts the "symmetric key" with the other party's public key, and the other party decrypts it with its own private key to obtain the "symmetric key". This ensures that the key is exchanged securely, after which the two sides communicate using symmetric encryption. HTTPS therefore uses a hybrid encryption mechanism that combines symmetric and asymmetric encryption.
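The hybrid scheme described here can be sketched end to end in Python. Everything in the block is a toy chosen for illustration: the RSA key pair is built from two small, publicly known Mersenne primes, there is no padding, and the symmetric cipher is a simple SHA-256 keystream XOR. It mirrors the description in this article rather than the exact key exchange of any particular TLS version (TLS 1.3, for instance, agrees on keys with Diffie-Hellman instead of encrypting them with RSA).

```python
import hashlib
import secrets

# Server side: toy RSA key pair built from two Mersenne primes (NOT secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # public exponent, sent to clients
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, never leaves the server


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256 based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client side: pick a random session key and wrap it with the public key.
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)

# Server side: unwrap the session key with the private key.
server_key = pow(wrapped_key, D, N).to_bytes(16, "big")

# Both sides now switch to the fast symmetric cipher for the actual messages.
ciphertext = xor_cipher(session_key, b"GET /account HTTP/1.1")
received = xor_cipher(server_key, ciphertext)
```

The slow asymmetric operation happens once per session, on 16 bytes; all subsequent traffic uses the cheap symmetric cipher.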

Solving the problem that messages may be tampered with: digital signatures

Data passes through many intermediate nodes as it crosses the network. Even though those nodes cannot decrypt the data, they may still tamper with it. How can the integrity of the data be verified? By checking the digital signature.

A digital signature has two functions:

It confirms that the message was indeed signed by the sender, because no one else can forge the sender's signature.

It establishes the integrity of the message, proving that the data has not been tampered with.

How a digital signature is generated:

A hash function is applied to the text to produce a message digest, and the digest is then encrypted with the sender's private key to produce the digital signature, which is sent to the receiver together with the original text. The receiver then verifies the digital signature as follows.

Digital signature verification process:

The receiver decrypts the encrypted digest, which is only possible with the sender's public key, then applies the same hash function to the received original text to produce a second digest, and compares the two. If they are identical, the received information is complete and was not modified in transit; otherwise it has been modified. This is how a digital signature verifies the integrity of the information.
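The generate-then-verify flow can be sketched with the same kind of toy RSA arithmetic. The key pair below is hypothetical and insecure (small known primes, no padding, digest reduced modulo N so that it fits); real signature schemes add padding such as RSA-PSS or use other algorithms entirely.

```python
import hashlib

# Toy RSA key pair for the sender (hypothetical, NOT secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # public exponent, known to everyone
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, known only to the sender


def digest(message: bytes) -> int:
    """Hash the message and reduce it so that it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: transform the digest with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest(message)


original = b"transfer 100 coins"
signature = sign(original)

print(verify(original, signature))                 # untouched message
print(verify(b"transfer 900 coins", signature))    # message altered in transit
```

The receiver needs only the message, the signature and the sender's public values `E` and `N`; the private exponent `D` is never transmitted.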

Suppose messages are passed between Kobe and James. James sends a message together with its digital signature to Kobe. On receiving it, Kobe can check the digital signature to verify that the message really was sent by James. Of course, this presupposes that Kobe knows James' public key. The crux of the problem is that, like the message itself, the public key cannot simply be sent to Kobe over an insecure network: how could Kobe prove that the public key he obtained really is James'?

This is where a Certificate Authority (CA) comes in. There are not many CAs, and the certificates of all trusted CAs are built into Kobe's client. The CA digitally signs James' public key (together with other information) and generates a certificate.

Solving the problem that the identity of the communicating party may be impersonated: digital certificates

The digital certificate authority acts as a third-party organization that both the client and the server can trust.

Here is how a digital certificate authority operates:

The server operator submits its public key, organization information, personal information (domain name) and other details to a third-party CA and applies for certification;

The CA verifies the authenticity of the submitted information by online and offline means, for example whether the organization exists, whether the enterprise is legitimate, and whether it owns the domain name;

If the information is approved, the CA issues a certification document, the certificate, to the applicant. The certificate contains the following in plaintext: the applicant's public key, the applicant's organization and personal information, information about the issuing CA, the validity period, the certificate's serial number and so on. It also contains a signature. The signature is generated as follows: a hash function first computes a digest of the public plaintext information, then the digest is encrypted with the CA's private key, and the resulting ciphertext is the signature;

When the client sends a request to the server, the server returns the certificate file;

The client reads the plaintext information in the certificate and computes its digest with the same hash function, then decrypts the signature with the public key of the corresponding CA and compares the result with the digest it computed. If they match, the certificate is confirmed as valid, that is, the server's public key can be trusted;

The client also verifies the domain name, validity period and other information in the certificate. The client has the certificate information (including public keys) of trusted CAs built in. If the CA is not trusted, no certificate for that CA will be found and the certificate is judged invalid.
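The issuing and checking steps above can be modelled in miniature. In this sketch the certificate is a plain Python dict rather than a real X.509 structure, the CA name "Toy Root CA", the domain shop.example and the placeholder public key string are all invented for illustration, and the CA key is the same kind of insecure toy RSA key used for teaching purposes.

```python
import hashlib
import json

# Toy CA key pair (NOT secure). The client ships with the CA's public key built in.
P, Q = 2**61 - 1, 2**89 - 1
CA_N = P * Q
CA_E = 65537
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))      # CA private key, kept by the CA only
TRUSTED_CAS = {"Toy Root CA": (CA_E, CA_N)}  # the client's built-in trust list


def digest(fields: dict, modulus: int) -> int:
    """Hash the plaintext certificate fields in a stable (sorted) serialization."""
    data = json.dumps(fields, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def issue_certificate(fields: dict) -> dict:
    """CA side: sign the digest of the plaintext fields with the CA private key."""
    return {"fields": fields, "signature": pow(digest(fields, CA_N), CA_D, CA_N)}


def verify_certificate(cert: dict, hostname: str, today: str) -> bool:
    """Client side: check issuer, signature, domain name and validity period."""
    fields = cert["fields"]
    ca_key = TRUSTED_CAS.get(fields["issuer"])
    if ca_key is None:                       # unknown CA: reject
        return False
    e, n = ca_key
    if pow(cert["signature"], e, n) != digest(fields, n):
        return False                         # fields were altered after signing
    if fields["domain"] != hostname:
        return False
    # ISO-formatted dates compare correctly as strings.
    return fields["not_before"] <= today <= fields["not_after"]


cert = issue_certificate({
    "domain": "shop.example",
    "public_key": "server-public-key-placeholder",
    "issuer": "Toy Root CA",
    "not_before": "2024-01-01",
    "not_after": "2025-01-01",
})
# An attacker reuses the signature but swaps in a different domain.
forged = {"fields": dict(cert["fields"], domain="evil.example"),
          "signature": cert["signature"]}
```

Calling `verify_certificate(cert, "shop.example", "2024-06-01")` succeeds, while the forged certificate fails the signature check because its fields no longer hash to the signed digest.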

IV. HTTPS workflow

1. The client initiates an HTTPS request (such as https://juejin.im/user/5a9a9cdcf265da238b7d771c). According to RFC 2818, the client knows that it needs to connect to port 443 (the default) of the server.

2. The server returns its pre-configured public key certificate to the client.

3. The client verifies the public key certificate: for example, whether it is within its validity period, whether the purpose of the certificate matches the site the client requested, whether it appears on the CRL (certificate revocation list), and whether its parent certificate is valid. This is a recursive process that continues until the root certificate (built into the operating system or into the client) is reached. If verification passes, the process continues; otherwise a warning is displayed.

4. The client uses a pseudo-random number generator to generate the symmetric key used for encryption, encrypts this symmetric key with the public key from the certificate, and sends it to the server.

5. The server decrypts this message with its own private key to obtain the symmetric key. At this point, both client and server hold the same symmetric key.

6. The server encrypts "plaintext content a" with the symmetric key and sends it to the client.

7. The client decrypts the ciphertext of the response with the symmetric key to obtain "plaintext content a".

8. The client initiates another HTTPS request, encrypting the requested "plaintext b" with the symmetric key, and the server decrypts the ciphertext with the symmetric key to obtain "plaintext b".
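In practice an application rarely performs these steps by hand; a TLS library does. As a rough sketch of where the steps above surface in code, the following uses Python's standard `ssl` module. The helper name `describe_connection` is hypothetical, and it needs network access, so it is only defined here and not called.

```python
import socket
import ssl

# The default context loads the CA certificates trusted by the system and
# turns on certificate and host name verification (the checks of step 3).
context = ssl.create_default_context()


def describe_connection(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and report what was negotiated (needs network access)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake (certificate exchange, verification, key agreement)
        # takes place inside wrap_socket; it raises an SSL error if verification fails.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),   # negotiated protocol version string
                "cipher": tls.cipher(),      # (cipher name, protocol, secret bits)
                "subject": tls.getpeercert().get("subject"),
            }


# Both checks are enabled by default for client-side contexts.
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

After `wrap_socket` returns, everything written to or read from `tls` is encrypted with the negotiated symmetric key, which corresponds to steps 6 to 8.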

V. Differences between HTTP and HTTPS

HTTP is a plaintext transfer protocol, while HTTPS is a network protocol built from SSL + HTTP that provides encrypted transmission and identity authentication, which makes it more secure than HTTP.

In terms of security, the simplest metaphor for the relationship between the two is a truck carrying goods: under HTTP the truck is open and the goods are exposed, whereas HTTPS is a closed container truck, so safety is naturally improved.

HTTPS is more secure than HTTP, friendlier to search engines and good for SEO. Google and Baidu give priority to indexing HTTPS pages;

HTTPS requires an SSL certificate, but HTTP does not;

The standard port for HTTPS is 443, and the standard port for HTTP is 80;

HTTPS runs HTTP on top of an SSL/TLS layer that sits above the transport layer, while HTTP runs directly on the transport layer (TCP);

HTTPS displays a security lock icon in the browser, but HTTP does not.
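The default ports can be looked up from Python's standard library, which exposes them as the constants `HTTP_PORT` and `HTTPS_PORT` in `http.client`. The helper `effective_port` below is a hypothetical name for this sketch; it returns the port a client would connect to for a given URL.

```python
from http.client import HTTP_PORT, HTTPS_PORT
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the default for its scheme."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]


print(effective_port("https://juejin.im/user/5a9a9cdcf265da238b7d771c"))
print(effective_port("http://example.com/"))
```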

VI. Why don't all websites use HTTPS?

Since HTTPS is so safe and reliable, why don't all websites use HTTPS?

first