
Python gRPC practice (2): Protocol Buffer


Preface

The previous article, Python gRPC practice (1): a brief introduction to gRPC, explained that gRPC uses HTTP/2 as its transport protocol and how gRPC carries data over HTTP/2. This article focuses on the serialization protocol gRPC uses: Protocol Buffer.

1. Introduction

1.1. What is Protocol Buffer

Protobuf (Google Protocol Buffers) is a cross-language, cross-platform, extensible data protocol developed by Google for serializing structured data. It is now widely used for data transfer between servers and clients. To use gRPC well in a project, you first need a clear understanding of Protocol Buffer usage and syntax.

NOTE: Like JSON, Protobuf can also be used on its own and is not limited to gRPC. You can build your own data serialization and deserialization on top of it.

1.2. Why gRPC uses Protocol Buffer as its serialization protocol

In its early days gRPC supported only Protobuf. Recent versions have started to support JSON, but few people use it. Why did gRPC choose Protobuf in the first place? One important reason is that Protobuf is also a Google product, so Protobuf can iterate in step whenever gRPC gains new features. Protobuf has now reached version 3, but only versions 2 and 3 are publicly available, because version 1 was used internally at Google. The more important reason is that in common scenarios Protobuf is more efficient than the JSON we usually use. Why is Protobuf so efficient? There is no free lunch, and every gain comes with a loss. Before looking at Protobuf, consider this piece of JSON data:

{
  "project": "Test",
  "timestamp": 1600000000,
  "status": true,
  "data": [
    {
      "demo_key": "fake_value"
    },
    {
      "demo_key": "fake_value"
    },
    {
      "demo_key": "fake_value"
    }
  ]
}

This JSON data is a piece of text, and that is the first source of JSON's inefficiency: inefficient encoding. For example, the value true of the field status needs only 1 byte in memory but takes 4 bytes in this data. Likewise, the value of the field timestamp is an int, which takes little space in memory, but JSON renders it as a string of digits, which takes more. We can also see at a glance what this data contains. That is an advantage of JSON, but it brings another drawback: redundant information. For example, the field data is an array whose elements all share the same structure, so the same field name is transmitted again for every element.
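The two costs described above can be measured with the standard library alone. This is only an illustrative sketch: the `struct` layout used here is a fixed-width binary layout chosen for comparison, not Protobuf's actual encoding.

```python
import json
import struct

record = {"project": "Test", "timestamp": 1600000000, "status": True}

# JSON renders every value as text.
status_text = json.dumps(record["status"])        # 'true'
timestamp_text = json.dumps(record["timestamp"])  # '1600000000'

# The same values in a fixed-width binary layout (an assumption made
# for comparison only; Protobuf uses its own encoding).
status_binary = struct.pack("?", record["status"])         # 1 byte
timestamp_binary = struct.pack("<I", record["timestamp"])  # 4 bytes

print(len(status_text), len(status_binary))        # 4 1
print(len(timestamp_text), len(timestamp_binary))  # 10 4

# Redundancy: three list items carry the field name three times.
data = [{"demo_key": "fake_value"}] * 3
name_count = json.dumps(data).count("demo_key")
print(name_count)  # 3
```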

Protobuf addresses both problems. First, it introduces optimized encoding schemes to fix the inefficient encoding. For example, it encodes and decodes numbers as varints, which saves space for numbers and relies on bit operations, so it is very fast. Articles that explain the varint encoding principle in detail cover the specifics. The second improvement is to drop field names and use field numbers instead. Only the number is transmitted, which removes the redundancy, but both parties then need a record that maps each number back to the real field name, much like a codebook for Morse code communication. In Protobuf the proto file is that codebook: it records the mapping between fields and numbers, as well as the interface and service a request belongs to. Taking the packet capture from Python gRPC practice (1): a brief introduction to gRPC as an example, the captured request corresponds to the following proto file:

syntax = "proto3";
package user;
import "google/protobuf/empty.proto";
// delete user
message DeleteUserRequest {
  string uid = 1;
}
service User {
  rpc delete_user(DeleteUserRequest) returns (google.protobuf.Empty);
}

The captured request says only that field 1 has the value 999. After receiving it, the receiver looks the data up in the proto file. From the URL it learns that the request targets the service User and the rpc delete_user, so the request message is DeleteUserRequest, and it then knows that the real name of field 1 is uid.
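To make the "number instead of name" idea concrete, here is a minimal hand-written sketch of how a single string field is laid out on the wire: a tag built from the field number and wire type, a length, then the payload. The helper names `encode_varint` and `encode_string_field` are hypothetical and written for this illustration only. Real projects would use the code generated from the proto file.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field as tag + length + UTF-8 payload."""
    length_delimited = 2  # wire type used for strings
    payload = text.encode("utf-8")
    tag = encode_varint((field_number << 3) | length_delimited)
    return tag + encode_varint(len(payload)) + payload


# DeleteUserRequest(uid="999"): only field number 1 travels, not "uid".
message = encode_string_field(1, "999")
print(message)       # b'\n\x03999'
print(len(message))  # 5
```

The field name uid never appears in the five bytes. The receiver recovers it from the proto file, as described above.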

2. Using Protocol Buffer

The encoding principles of Protobuf are worth reading about, and plenty of material is available online, so this article skips them and goes straight to how to use Protobuf (to be honest, I do not yet know much about Protobuf's encoding).

As the example above shows, gRPC needs the proto file at runtime to recover the real field data, and gRPC supports many languages. So how does gRPC in each language use the proto file to interpret the data?

When we write a project, we usually generate an OpenAPI file from the interface code, and tools such as Swagger then read that file and render API documentation. The proto file plays a role similar to the OpenAPI file, except that it is not generated from code. Developers write it by hand and then use different tools to generate code in different languages from it and add that code to their projects. To make good use of gRPC you therefore need to know how to write proto files (when using gRPC, the calling code is usually generated from the Protobuf file).

2.1. Protobuf syntax

Before introducing the syntax, look at what a proto file contains. Here again is the proto file from above:

syntax = "proto3";
package user;
import "google/protobuf/empty.proto";
// delete user
message DeleteUserRequest {
  string uid = 1;
}
service User {
  rpc delete_user(DeleteUserRequest) returns (google.protobuf.Empty);
}

A standard proto file, like this sample, can be divided into three parts. The first part is the first three lines, the declaration area of the proto file. The first line states that the syntax of this proto file is proto3 (unless stated otherwise, the syntax described in this article is proto3). The second line states that the package name of the file is user, which makes it easier for other files to import its definitions. The third line imports the empty.proto file, so that whatever empty.proto defines can be used in this file.

The second part is lines 4 to 7, the message body area. It defines a message body named DeleteUserRequest with a field named uid, whose type is string and whose field number is 1. In real development most changes happen in this part, and it has many points that need attention.

The third part is lines 8 to 10, the service definition area. It defines a service named User with a method named delete_user. The method accepts a DeleteUserRequest message body as the request and returns an Empty message body as the response. You can think of this part as defining a class and some methods on it, where the methods have only signatures and no implementation.

With the structure of the proto file understood, we can move on to the Protobuf syntax.

2.1.1. Field number

The most important point when writing a message body is the field number. As described earlier, Protobuf serialization is translated through field numbers, so field numbers and fields must correspond one to one. Field numbers should generally start at 1 and increase step by step, as in the following message body:

message DemoRequest {
  string uid = 1;
  string mobile = 2;
  int32 age = 3;
}

Its field numbers increase step by step. When adding fields later, keep assigning numbers incrementally, and never reuse a field number that once existed, even when a field is reworked. For example, change the message body above to:

// Fields that have been used are usually not deleted; this is only a demonstration
message DemoRequest {
  string uid = 1;
  string mobile = 2;
  int32 brithday = 4; // Use a timestamp to represent the date
}

Although the age field is replaced by brithday in the changed message body, brithday takes the next number in the sequence, 4, instead of reusing 3. This prevents data parsing errors when an old client has not been updated along with the server.
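The risk of reusing a number can be sketched with plain dictionaries standing in for the proto "codebook". This is a hypothetical simulation, not the Protobuf library: the schema names and the `encode`/`decode` helpers are invented here, and the sketch assumes that a receiver skips field numbers it does not know.

```python
# Plain dicts map field numbers to field names, like a proto file does.
OLD_SCHEMA = {1: "uid", 2: "mobile", 3: "age"}
REUSED_SCHEMA = {1: "uid", 2: "mobile", 3: "brithday"}  # number 3 reused
SAFE_SCHEMA = {1: "uid", 2: "mobile", 4: "brithday"}    # new number 4


def encode(message: dict, schema: dict) -> dict:
    """Replace field names with field numbers, as the sender would."""
    numbers = {name: number for number, name in schema.items()}
    return {numbers[name]: value for name, value in message.items()}


def decode(wire: dict, schema: dict) -> dict:
    """Translate numbers back to names; unknown numbers are skipped."""
    return {schema[n]: value for n, value in wire.items() if n in schema}


new_message = {"uid": "999", "brithday": 946684800}

# An old client reads data from a server that reused field number 3.
misread = decode(encode(new_message, REUSED_SCHEMA), OLD_SCHEMA)
print(misread)  # {'uid': '999', 'age': 946684800}

# With a fresh number the old client simply ignores the new field.
safe = decode(encode(new_message, SAFE_SCHEMA), OLD_SCHEMA)
print(safe)     # {'uid': '999'}
```

In the first case the old client silently treats a timestamp as an age. In the second it loses nothing it knew about.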

Incrementing field numbers lets developers see which numbers have already been used, but it only works if the whole team follows the convention. Protobuf therefore provides the reserved keyword, which blocks field numbers that must not be used later. For example:

message DemoRequest {
  string uid = 1;
  string mobile = 2;
  reserved 3;
  int32 brithday = 4; // Use a timestamp to represent the date
  reserved 5, 6, 10 to 15; // reserved can cover several field numbers at once, separated by `,`; `xx to xx` reserves a continuous range
}

This example prevents later fields from using field number 3. If someone does use it, the Protobuf compiler reports an error, which stops the problem at the source.

NOTE: Field numbers should start at 1 and increase because, when Protobuf encodes a message into binary, field numbers 1 to 15 take 1 byte and field numbers 16 to 2047 take 2 bytes. Using numbers 1 to 15 first reduces the amount of data transmitted. If a message body has many fields from the start, assign the numbers 1 to 15 to the most frequently used fields. In addition, the numbers 19000 to 19999 are reserved for the Protocol Buffers implementation and cannot be used when defining a message. If they are used, the Protobuf compiler reports an error.
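The byte counts in this note follow from how the field tag is built: the field number is shifted left by three bits to make room for the wire type, and the result is stored as a varint with 7 payload bits per byte. A small sketch, with hypothetical helper names `varint_size` and `tag_size`, reproduces the boundaries:

```python
def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer needs as a varint."""
    size = 1
    while value >= 0x80:  # more than 7 bits left
        value >>= 7
        size += 1
    return size


def tag_size(field_number: int, wire_type: int = 0) -> int:
    """Size of the tag that precedes every encoded field."""
    return varint_size((field_number << 3) | wire_type)


for number in (1, 15, 16, 2047, 2048):
    print(number, tag_size(number))
# 1 1
# 15 1
# 16 2
# 2047 2
# 2048 3
```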

2.1.2. How to use

In a Protobuf message body the type of every field is fixed, because fixed types reduce the transmission resources needed. When defining the fields of a message body, choose each field's type according to business requirements. The following table maps the common Protobuf basic field types to Python types:

Protobuf type | Python type | Notes
double        | float       |
float         | float       |
int32         | int         | Variable-length encoding. Inefficient for negative numbers; use sint32 instead if the field may hold them.
int64         | int         | Variable-length encoding. Inefficient for negative numbers; use sint64 instead if the field may hold them.
uint32        | int         | Variable-length encoding.
uint64        | int         | Variable-length encoding.
sint32        | int         | Handles negative numbers efficiently. Use it in place of int32 when the field may hold negative numbers.
sint64        | int         | Handles negative numbers efficiently. Use it in place of int64 when the field may hold negative numbers.
fixed32       | int         | Always 4 bytes. More efficient than uint32 if values are usually greater than 2^28.
fixed64       | int         | Always 8 bytes. More efficient than uint64 if values are usually greater than 2^56.
bool          | bool        |
string        | str         |
bytes         | bytes       |
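The difference between int32 and sint32 for negative numbers can be checked with a short sketch. It assumes the documented behaviour that a negative int32 is sign-extended to 64 bits before being written as a varint, while sint32 first applies ZigZag encoding, which maps small negative numbers to small positive ones. The helper names are hypothetical.

```python
def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer needs as a varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def int32_wire_value(n: int) -> int:
    """int32: negative values are sign-extended to 64 bits."""
    return n & 0xFFFFFFFFFFFFFFFF


def zigzag32(n: int) -> int:
    """sint32: ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ..."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


for n in (1, -1, -64, 300):
    print(n, varint_size(int32_wire_value(n)), varint_size(zigzag32(n)))
# 1 1 1
# -1 10 1
# -64 10 1
# 300 2 2
```

A value of -1 costs ten bytes as int32 but one byte as sint32, which is why the table recommends the sint types for fields that may be negative.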

Note that although a declared field does not specify a value, every field has a default value:

  • String type: an empty string
  • Byte type: empty bytes
  • Numeric type: 0
  • enum: the first defined element, whose value must be 0

A defined message body is itself a Protobuf type, called Message, and it can be nested inside other Messages. The Protobuf syntax is as follows:

message DemoSubRequest {
  string a = 1;
  int32 b = 2;
}
message DemoRequest {
  DemoSubRequest result = 1;
}

With the import syntax, a message body can also be imported from file a into file b and used there. For example, suppose the folder project contains file a and file b, and file a looks like this:

syntax = "proto3";
// Declare the package name as demo_a
package demo_a;
// Define a message body
message DemoRequest {
  string a = 1;
}

File b references the message body of file a. The code is as follows:

syntax = "proto3";
// Declare the package name as demo_b
package demo_b;
import "project/demo_a.proto";
message DemoRequest {
  // Reference the message body of file a through its package name
  demo_a.DemoRequest result = 1;
}

Protobuf also supports several other types. They are used much like their Python equivalents, but with some differences:

  • Timestamp:

    Timestamp is the time type in Protobuf. The Protobuf syntax is as follows:

    import "google/protobuf/timestamp.proto";
    message DemoRequest {
      google.protobuf.Timestamp timestamp = 1;
    }

    This type is a wrapper around a timestamp, and its default value is timestamp=0 (the date 1970-01-01). In Python code, the ToDatetime method converts it to a datetime, and the FromDatetime method converts a datetime into a Protobuf Timestamp:

    from datetime import datetime
    from google.protobuf.timestamp_pb2 import Timestamp
    # Timestamp -> datetime; an unset Timestamp is 1970-01-01 00:00:00
    Timestamp().ToDatetime()
    # datetime -> Timestamp; FromDatetime fills the message in place
    timestamp = Timestamp()
    timestamp.FromDatetime(datetime.now())  # a naive datetime is treated as UTC
  • Repeated:

    Repeated means the field can be repeated any number of times. It behaves like a Python Sequence object and can in practice be treated as a Python List. The Protobuf syntax for Repeated is as follows:

    message DemoRequest {
      repeated int32 demo_list = 1;
    }
    // demo_list value like json
    // [1, 2, 3, 4, 5, 6]

    The message body defines a demo_list field that is repeated and whose element type is int32. In Python, a Repeated field is used in the same way as a List, but it does not inherit from List, so some libraries, such as pymysql, may need it converted to a List first.

  • Map:

    Most of the time we define message bodies with explicit keys and values, but Protobuf also provides Map, which is similar to a dict. The Protobuf syntax for Map is as follows:

    message DemoRequest {
      map<string, int32> demo_map = 1;
    }
    // demo_map value like json
    // {
    // "aaa": 123,
    // "bbb": 456
    // }

    The message body defines a demo_map field of type map whose key type is string and whose value type is int32. In Python, a Map field is used in the same way as a dict, but it does not inherit from dict, so some libraries, such as pymysql, may need it converted to a dict first.

    NOTE:

    • A Map field cannot be declared repeated, and its key must be an integer or string type, much as a Python dict key cannot be a list.
    • The entries of a Map are unordered.
    • If a key appears more than once, only the last entry is kept.
  • Empty:

    Empty is the type that represents "nothing" in Protobuf, like None in Python. It is generally not used inside a message body but marks that an rpc method returns nothing. The Protobuf syntax is as follows:

    import "google/protobuf/empty.proto";
    service Demo {
      rpc demo (DemoRequest) returns (google.protobuf.Empty);
    }

    In Python, import and use the Empty object with from google.protobuf.empty_pb2 import Empty. It is best not to convert Empty to Python's None, because Empty only indicates that the response of the request is empty.

  • Enum: When defining a message type, you may want a field to accept only one of a predefined set of values. That is what enumeration types are for. The Protobuf syntax for Enum is as follows:

    message DemoRequest {
      enum Status {
        open = 0;
        half_open = 1;
        close = 2;
      }
      Status status = 1;
    }

    As the syntax shows, first define an enumeration type named Status, then define a field status of type Status. Note that an enumeration definition must contain a constant that maps to 0, and it must be the first line of the definition. Protobuf requires this because when a field of this type has no value set, its default is the enumeration constant whose value is 0.

2.2. Proto file management and usage conventions

In practice, the services connected through gRPC are not all written in one programming language. Some may be written in Python, some in Java, and some in Go. Also, not every service needs to be updated when we release a feature, and some services only need the old interface. Suppose a server interface is updated and the server has many clients. Without disciplined management of the proto files, every client might have to be upgraded instead of only those that need the change. So proto files should be managed according to a convention to reduce the management burden.

2.2.1. Choosing an approach

At first I chose the simplest approach, copying files, which is also what most people do when getting started. It is very easy, but code reuse is poor. Copying files becomes a burden once there are many projects, and sometimes a diff tool is needed to compare copies, which is tedious.

Later I considered managing the files with a version control tool. Because the proto files are a subset of the project, Git Submodule is the first option that comes to mind. However, it carries the risk of the submodule pointer being rolled back unintentionally, and the corresponding code still has to be generated from the proto files, which is more trouble.

The final approach is to create a new git repository for the proto files and use tags to distinguish versions. A git repository has another advantage: CI/CD can generate and package the code for each language from the proto files, which removes some manual steps.

2.2.2. Usage

First create a Git repository and move the proto files into it as a standalone repository, then update the proto files following the git flow process. When updating proto files, follow these rules:

  • proto files are only added, never removed
  • interfaces in a proto file are only added, never removed
  • fields in a message of a proto file are only added, never removed
  • the type and number of a message field in a proto file must not be changed

What these rules have in common is that nothing existing is deleted and every change only adds. This ensures that even after the proto files change, old services keep working without being updated.
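These rules lend themselves to an automated check, for example in CI. The sketch below is a hypothetical illustration only: it assumes each schema version has already been turned into a dict of message name to {field number: (field name, type)}, and it does not parse real proto files or cover services.

```python
def check_compatible(old: dict, new: dict) -> list:
    """List the ways `new` breaks the add-only rules relative to `old`."""
    problems = []
    for message, old_fields in old.items():
        if message not in new:
            problems.append(f"message {message} was removed")
            continue
        new_fields = new[message]
        for number, (name, type_) in old_fields.items():
            if number not in new_fields:
                problems.append(f"{message}: field {number} ({name}) was removed")
            elif new_fields[number][1] != type_:
                problems.append(f"{message}: field {number} ({name}) changed type")
    return problems


v1 = {"DemoRequest": {1: ("uid", "string"), 2: ("mobile", "string"),
                      3: ("age", "int32")}}
# Only adds a field: allowed.
v2_ok = {"DemoRequest": {**v1["DemoRequest"], 4: ("brithday", "int32")}}
# Changes a type and drops a field: not allowed.
v2_bad = {"DemoRequest": {1: ("uid", "string"), 2: ("mobile", "int32")}}

print(check_compatible(v1, v2_ok))   # []
print(check_compatible(v1, v2_bad))
# ['DemoRequest: field 2 (mobile) changed type',
#  'DemoRequest: field 3 (age) was removed']
```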

After the update, other projects can use the new version. For example, if the current version of the repository is 1.0.0, we update the proto files following the git flow process, generate the code or release package for each language, and finally apply the corresponding tag. For Python, dependencies can then be installed or updated like this:

pip install https://gitlab.xxx.com/proto/[email protected]

For languages such as Java, the code can be packaged as a release version for Maven to use.

3. Summary

We now have a preliminary understanding of how to use gRPC and Protobuf. The next article will use a simple project to demonstrate how to use gRPC.
