Mar 21, 2020 - A strategy for effective system modularization

Back in 1972, almost half a century ago, David Lorge Parnas published an iconic paper entitled “On the Criteria to Be Used in Decomposing Systems into Modules” [1]. In it he discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while shortening its development time, and also presents a criterion for effectively decomposing a system into modules.

When I first read this paper I was impressed by how relevant and practical it is. I stumbled on it while reading through an online discussion on the topic of object oriented programming. Yet again someone had published a post criticizing the fundamental concepts of OOP and in the discussion there was a comment pointing out to the author that he had misinterpreted the concept of encapsulation in his critique, linking to the paper.

It was in this paper that the concept of information hiding, closely related to encapsulation, was first described. This concept plays a central role in the strategy for effective system modularization, as I’ll describe in the following sections.

Benefits of an effective modularization

Modularization is the division of a system or product into smaller, independent units that work together to implement the system’s requirements. In software projects it applies both at higher levels of abstraction (ex: microservices architecture) and at lower levels (ex: object oriented class design).

When performed effectively, modularization brings many benefits, among them:

  • Managerial: Development time should be shortened because separate teams would work on each module with little need for communication
  • Flexibility: It should be possible to make drastic changes to one module without a need to change others
  • Comprehensibility: It should be possible to study the system one module at a time, and the whole system can therefore be better designed because it is better understood

A strong indicator that a system has not been properly modularized is that it fails to deliver the benefits listed above. The challenge is then to define, and follow, a criterion that leads the system towards an effective modularized structure.

According to D.L. Parnas, one might choose between two distinct criteria for breaking a system into modules:

  1. The Procedural Criterion: Make each major step in the processing a module, typically beginning with a rough flowchart and moving from there to a detailed implementation
  2. The Information Hiding Criterion: Every module is characterized by its knowledge of a design decision which it hides from all others

In the next section we will use an example system to demonstrate how following the second criterion leads to a much more effective modularized structure than following the first one, and why the first criterion should never be followed on its own unless there is a strong motivation to do so.

Example system: Scheduling Calendar

Consider a Scheduling Calendar system that implements the following features:

  • As an organizer, I want to create a scheduled event so that I can invite guests to attend it
  • As an organizer, I want to be informed of conflicting guest schedules so that I’m able to propose a valid event date
  • As a participant, I want automatic reminders to notify me of upcoming events I should attend so that I don’t miss them

Let’s exercise both criteria for sketching this system’s modularized structure. Notice that I will not be using class diagrams, so as not to induce an OOP bias in this exercise.

Using the procedural criterion

A straightforward procedure for implementing the event creation feature is:

  1. Read JSON input with the proposed event information (Date, Title, Location, Participants)
  2. Validate against a user_schedules database table that all participants can attend this new event
  3. If one or more participants can’t attend, throw an exception reporting the conflict, otherwise proceed
  4. Insert an entry in an events table and one entry for each participant in a user_schedules table

We also need to define a procedure for implementing the automatic notification feature:

  1. Set up a notifier task that continuously polls the user_schedules table
  2. Select all user_schedules entries whose notification_date column is due and notified column is false
  3. For each resulting entry, send an e-mail reminder message to the corresponding event participant
  4. Then, for each resulting entry, set the notified column value to true

The database schema is only loosely defined here since it’s not the central point of this discussion. Suffice it to say that, assuming a relational database in third normal form, three tables would cover the storage needs of this exercise: events, users and user_schedules.

Based on these two procedures, we might define the following modules for the Scheduling Calendar:

Procedural Modularization

Naturally, following this criterion leads to modules with several responsibilities. The scheduler module is parsing the input, validating data, querying the database and inserting new entries. The notifier module is also querying the database, modifying entries, and preparing and sending e-mail messages.
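For concreteness, here’s a rough C# sketch of what such a scheduler module tends to look like (all names are hypothetical, not taken from a real design), with input parsing, validation and persistence interleaved in a single unit:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: under the procedural criterion the scheduler module
// accumulates knowledge of the input format, the validation rules and the
// database schema all at once.
public class Scheduler
{
    public void CreateEvent(string json)
    {
        var proposal = ParseJson(json);                   // input format knowledge
        var conflicts = FindScheduleConflicts(proposal);  // database schema knowledge
        if (conflicts.Any())
            throw new InvalidOperationException(
                "Conflicting schedules: " + string.Join(", ", conflicts));
        InsertEvent(proposal);                            // more database knowledge
        InsertUserSchedules(proposal);
    }

    // Stubs standing in for the JSON parsing and SQL access code.
    private EventProposal ParseJson(string json) { return new EventProposal(); }
    private List<string> FindScheduleConflicts(EventProposal p) { return new List<string>(); }
    private void InsertEvent(EventProposal p) { }
    private void InsertUserSchedules(EventProposal p) { }
}

public class EventProposal { /* Date, Title, Location, Participants */ }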

Using Information Hiding as a criterion

Information hiding is the principle of segregation of the design decisions in a system that are most likely to change, thus protecting other parts of the system from extensive modification if a design decision is indeed changed. The protection involves providing a stable interface which isolates the remainder of the system from the implementation.

To apply this principle we start with the system requirements and extrapolate them, anticipating all possible improvement/change requests we can think of that our users, or really any stakeholder, might make:

  • Handle different input formats (ex: JSON, XML)
  • Allow the addition and removal of participants after the creation of an event
  • Allow users to customize the frequency of event reminders (ex: single vs multiple notifications per event)
  • Implement different notification types (E-mail, SMS, Push Notification)
  • Support a different storage medium (ex: SQL Database, NoSQL Database, In Memory - for testing purposes)

Hopefully most of them will make sense, but it’s always a good idea to involve a colleague to validate them before making a design decision that might be expensive to change later on.

Now the challenge is to define a system structure that isolates these possible changes to individual modules. Here’s a proposition:

Information Hiding Modularization

As you can see, several specialized modules appeared (in blue), segregating system responsibilities.

The InputParser module hides the knowledge of which input format is being used, converting the JSON data into an internal representation. If we are required to support XML instead, it’s just a matter of implementing another kind of InputParser and plugging it into our system.

The Repository modules hide the knowledge of the storage medium from the Scheduler and Notifier modules. Again, if we are required to move persistence to another kind of database, we can do so without ever touching the Scheduler and Notifier modules. On top of that, the specialized repository modules can absorb data modification and querying responsibilities, making it easier to implement functional changes to events and user schedules.

A MessageSender module is employed for hiding the knowledge of how to send specific notification types. It receives a standardized message request from the Notifier module and sends the corresponding e-mail reminder. If we need to start sending SMS reminders, we just have to implement a new kind of MessageSender and plug it into the output of the Notifier.
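Sketched as C# interfaces for concreteness (the names are mine, and the boundaries could just as well be processes or packages; EventProposal reuses the hypothetical type from the earlier sketch), each specialized module exposes a stable surface behind which one likely change is hidden:

// Hypothetical interfaces, one per design decision likely to change.
public interface IInputParser
{
    EventProposal Parse(string rawInput);       // hides the input format (JSON, XML, ...)
}

public interface IScheduleRepository
{
    bool HasConflicts(EventProposal proposal);  // hides the storage medium
    void Save(EventProposal proposal);
}

public interface IMessageSender
{
    void Send(MessageRequest request);          // hides the notification type (e-mail, SMS, ...)
}

public class MessageRequest { /* recipient, subject, body */ }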

With the extraction of these specialized modules, the original Scheduler and Notifier modules become thinner and take on a new role, acting as higher level services that orchestrate lower level modules to implement system operations. D.L. Parnas reasoned about the hierarchical structure that forms while decomposing the system this way, pointing out that it favors code reuse, boosting productivity. He also warned against lower level modules making use of higher level modules, as that would break the hierarchical structure.
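Under the same assumptions as before, the thinner Scheduler might reduce to little more than orchestration, depending only on the stable interfaces sketched above:

using System;

// Hypothetical sketch: a higher level service that merely orchestrates
// the specialized modules injected into it.
public class SchedulerService
{
    private readonly IInputParser parser;
    private readonly IScheduleRepository repository;

    public SchedulerService(IInputParser parser, IScheduleRepository repository)
    {
        this.parser = parser;
        this.repository = repository;
    }

    public void CreateEvent(string rawInput)
    {
        var proposal = parser.Parse(rawInput);
        if (repository.HasConflicts(proposal))
            throw new InvalidOperationException("Conflicting participant schedules");
        repository.Save(proposal);
    }
}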

Conclusion

In this exercise I tried to demonstrate how using the information hiding criterion naturally leads to an improved system structure when compared to using the procedural criterion. The latter results in fewer modules that aggregate many responsibilities, while the former promotes the segregation of responsibilities into several specialized modules. These specialized modules become the foundation of a hierarchical system structure that improves not only the comprehension of the system but also its flexibility.

The proposed strategy for effective system modularization is then to:

  1. Enlist all operations the system is required to implement
  2. Anticipate possible improvement/change requests for these operations
  3. Identify design decisions likely to change, prioritizing them if necessary
  4. Extract specialized modules that encapsulate these design decisions
  5. Establish and maintain a clear hierarchical structure within the system

The first two steps help visualize what the system’s design decisions are, upon which the information hiding criterion (third and fourth steps) is applied. Depending on how many of the enlisted design decisions are susceptible to change, a prioritization step may come in handy for directing development efforts and maximizing value:

Prioritization matrix

In closing I would like to add another quote from D.L. Parnas’s own conclusion, pertaining to this strategy’s fourth step, in which specialized modules are extracted from the system:

Each module is then designed to hide such a decision from the others. Since, in most cases, design decisions transcend time of execution, modules will not correspond to steps in the processing. To achieve an efficient implementation we must abandon the assumption that a module is one or more subroutines, and instead allow subroutines and programs to be assembled collections of code from various modules.

For me this quote captures the main paradigm shift from procedural to object oriented programming.


Sources

[1] Parnas, D.L. (December 1972). “On the Criteria To Be Used in Decomposing Systems into Modules” (PDF)

Feb 25, 2020 - A mindset for improving the product development workflow

Software developers are responsible for carrying out tasks from a product backlog in order to deliver increments of value on a regular basis. The mindset we adopt towards work can have a direct impact on the quality and speed of our output. In this article I briefly discuss how adopting a “consultant mindset” internally, within your company, can help improve the product development workflow.

Backlog Refinement

In an ideal world software developers would only work on tasks from a perfectly refined product backlog. Task requirements, user stories, user flows, interface designs, edge cases, etc. would be thoroughly detailed, allowing for maximum productivity during development since all required information would be readily available.

In practice this is seldom the case. Backlog refinement is an important but often neglected process. Below is a common definition of what it is:

Backlog refinement is the ongoing process of reviewing product backlog items and checking that they are appropriately prepared and ordered in a way that makes them clear and executable for teams once they are planned for development

It can be regarded as a specification process running somewhat orthogonally to the software development process, i.e., backlog refinement is driven by the product vision, fitted to technical constraints, and performed concurrently with and in anticipation of development:

Product Backlog

Typical signs of a deficient product backlog are:

  1. Lack of user stories detailing what a feature should do
  2. Missing interface designs defining what a feature should look like
  3. Absence of relevant non-functional requirements (ex: performance, technology stack, etc)
  4. Frequently changing feature requirements

In the absence of information developers are faced with two choices: either halt development and switch to another task, or fill in the gaps themselves to make completing the task feasible. The latter approach can be dangerous if not handled properly, since strategic product decisions could be left open and made on the fly without oversight.

Here enters the consultant mindset, which I propose as a solution to mitigate the risks posed by insufficiently refined backlog items and improve development productivity.

The Consultant Mindset

A consultant-minded developer is expected to work collaboratively within the organization, taking part not only in a software project’s technical scope but also in its business scope. He/she is genuinely interested in helping people understand and solve problems, acting as more than merely an executor of technical tasks.

Product Scopes

Larger organizations usually have the resources to build fully staffed, specialized business and technical teams. A deficient product backlog in this case will most likely be the result of problematic internal processes or underperforming professionals.

However, medium and small organizations, especially early stage startups, may not have the resources to maintain such teams, and team members are often required to hold multiple responsibilities spanning both business and technical scopes.

In this context consultant-minded developers are most valuable. Put simply, they are software developers who, before starting development on a backlog item, double check that it i) captures business value, ii) has clear acceptance criteria and iii) is technically feasible.

Upon finding specification issues with the task at hand, they actively collaborate with stakeholders: seeking more information to understand their perspective, giving expert technical and user experience design advice (which most developers naturally acquire over years of working on different projects), and proposing practical solutions for moving forward.

Agile software development frameworks (ex: Scrum) dictate that developers should be shielded from external influences and interruptions so as to protect their productivity. While this is true, it doesn’t mean the development team should be isolated in a silo, prevented from reaching out to stakeholders, internal or external, when needed in the context of backlog items to clear the path for development. As a bonus, software developers employing this mindset will grow their vision of the product and potentially perceive their work as more useful and meaningful.

Finally, this simple proactive behavior promotes a culture of action and an environment of open, transparent collaboration that benefits the company. And even though the motivation for proposing it comes from the scenario of a deficient product backlog, I see no reason why it wouldn’t also add value where backlog refinement is performed appropriately.

Jan 27, 2020 - Mail submission under the hood

A couple of years ago I implemented a mail submission component capable of interacting with SMTP servers over the internet for delivering e-mail messages. Developing it helped me better understand the inner workings of e-mail transmission, which I share in this article.

I named this component “Mail Broker” so as not to be confused with the “Message Submission Agent” already defined in RFC 6409, along with the other participating agents of the electronic mail infrastructure:

Message User Agent (MUA): A process that acts (often on behalf of a user and with a user interface) to compose and submit new messages, and to process delivered messages.

Message Submission Agent (MSA): A process that conforms to this specification. An MSA acts as a submission server to accept messages from MUAs, and it either delivers them or acts as an SMTP client to relay them to an MTA.

Message Transfer Agent (MTA): A process that conforms to SMTP-MTA. An MTA acts as an SMTP server to accept messages from an MSA or another MTA, and it either delivers them or acts as an SMTP client to relay them to another MTA.

This Mail Broker is intended for sending e-mail messages directly from within .NET applications, supporting alternate views and file attachments. Let’s take a look at some sample usage code (C#) to make it concrete:


using (var msg = new MailMessage())
{
    msg.From = new MailAddress("admin@mydomain.com", "System Administrator");
    msg.To.Add(new MailAddress("john@yahoo.com"));
    msg.CC.Add(new MailAddress("bob@outlook.com"));

    msg.Subject = "Hello World!";
    msg.Body = "<div>Hello,</div><br><br>" +
        "<div>This is a test e-mail with a file attached.</div><br><br>" +
        "<div>Regards,</div><br><div>System Administrator</div>";
    msg.IsBodyHtml = true;

    var plain = "Hello,\n\nThis is a test e-mail with a file attached.\n\n" +
        "Regards,\nSystem Administrator";
    var alt = AlternateView.CreateAlternateViewFromString(plain, new ContentType("text/plain"));
    msg.AlternateViews.Add(alt);

    var filePath = @"...\some-folder\pretty-image.png";
    var att = new Attachment(filePath);
    msg.Attachments.Add(att);
		
    var config = new BrokerConfig { HostName = "mydomain.com" };
    using (var broker = new MailBroker(config))
    {
        var result = broker.Send(msg);
        Console.WriteLine($"Delivered Messages: {result.Count(r => r.Delivered)}");
    }
}

The key difference about it is that messages can be sent from any source e-mail address directly to destination SMTP servers, as long as you perform the required sender domain and IP address authentication steps to prevent being blacklisted for misuse (more on that below).

I’ve put it together by integrating a couple of GitHub projects, fixing bugs, and implementing my own code as well. You can access its repository on GitHub here.

A lot of networking, encoding and security concepts were involved in this experiment, which I share in the following sections. I believe they are valuable for anyone trying to get a picture of what happens under the hood when we send an ordinary e-mail.

Internal Structure

The diagram below illustrates the Mail Broker’s internal structure:

mail broker

It is implemented on top of networking and message encoding components. Native framework components / resources such as TcpClient, RSACryptoServiceProvider, SHA256CryptoServiceProvider, Convert.ToBase64String and others were omitted from the diagram for simplicity.

I also took advantage of the .NET MailMessage class, as it already represents all fields and views of a typical e-mail message; it’s the same class that can be sent using the now obsolete System.Net.Mail.SmtpClient.

Sending a Message

Internally the message delivery is performed by the following piece of code:


var mxDomain = ResolveMX(group);

if (!string.IsNullOrWhiteSpace(mxDomain))
{
    Begin(mxDomain);
    Helo();
    if (StartTls())
        Helo();
    foreach (var addr in group)
        results.Add(Send(message, addr));
    Quit();
}
else
{
    foreach (var addr in group)
        results.Add(new Result(delivered: false, recipient: addr));
}

This piece is executed for each group of destination e-mail addresses (To, Cc, Bcc) sharing the same domain. First it resolves the mail exchanger (MX) record for the group’s domain, so as to find out where to connect for submitting the message. If an MX domain is not found, the code marks delivery for those recipients as failed. However, if the MX domain is found, it proceeds by beginning an SMTP session with this external mail server.
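As a side note, the domain grouping itself can be done with plain LINQ over the message’s address collections. A minimal sketch of the idea (the actual component’s code may differ, and ResolveMX stands in for its DNS query logic):

// Group all recipients (To, Cc, Bcc) by domain so that one SMTP session
// is opened per destination server rather than per address.
var recipients = message.To.Concat(message.CC).Concat(message.Bcc);

foreach (var group in recipients.GroupBy(addr => addr.Host))
{
    // group.Key is the domain (ex: "yahoo.com"); each group contains
    // every recipient hosted there.
    var mxDomain = ResolveMX(group.Key);
    // ... run the session snippet above against mxDomain
}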

The Mail Broker will connect to the SMTP server on port 25, receive a “ServiceReady” status code, and start the conversation by sending a HELO command. Then it tries to upgrade this plain text connection to an encrypted one by issuing a STARTTLS command. If that’s not supported by the mail server, it falls back to the plain text connection for delivering the message, which is really bad security-wise, but is the only option left for delivering the message in this case 😕
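The upgrade relies on standard .NET networking primitives. A minimal sketch of the idea, with error handling and certificate validation policy omitted (the host name is hypothetical):

using System.Net.Security;
using System.Net.Sockets;

// Open the plain text connection on port 25, then, once the server
// accepts the STARTTLS command, wrap the TCP stream in an SslStream
// and perform the TLS handshake as a client.
var tcpClient = new TcpClient("mx.example.com", 25);
var stream = tcpClient.GetStream();
// ... exchange HELO / STARTTLS over the plain stream first ...
var sslStream = new SslStream(stream);
sslStream.AuthenticateAsClient("mx.example.com");
// From here on, all SMTP commands go through sslStream instead.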

We all know that communication on the internet should always be encrypted, but I was amazed to find out how many SMTP servers still don’t support it and communicate in plain text, just hoping that no one is watching! Google’s Transparency Report indicates that today more than 10% of all e-mail messages sent by Gmail are still unencrypted, usually because the receiving end doesn’t support TLS.

Moving forward: after upgrading the connection, the code issues SMTP commands for individually sending the message to each destination address in the domain group, and afterwards QUITs the session. The sequence for sending the message is contained in the method shown below:


private Result Send(MailMessage message, MailAddress addr)
{
    WriteLine("MAIL FROM: " + "<" + message.From.Address + ">");
    Read(SmtpStatusCode.Ok);

    WriteLine("RCPT TO: " + "<" + addr.Address + ">");
    SmtpResponse response = Read();

    if (response.Status == SmtpStatusCode.Ok)
    {
        WriteLine("DATA ");
        Read(SmtpStatusCode.StartMailInput);
        WritePayload(message);
        WriteLine(".");
        response = Read();
    }
    else
    {
        WriteLine("RSET");
        Read(SmtpStatusCode.ServiceReady, SmtpStatusCode.Ok);
    }

    return new Result(
        delivered: response.Status == SmtpStatusCode.Ok,
        recipient: addr,
        channel: channel,
        response: response
    );
}

This code reproduces the typical SMTP transaction scenario described in RFC5321:

1.    C: MAIL FROM:<Smith@bar.com>
2.    S: 250 OK
3.    C: RCPT TO:<Jones@foo.com>
4.    S: 250 OK
5.    C: DATA
6.    S: 354 Start mail input; end with <CRLF>.<CRLF>
7.    C: Blah blah blah...
8.    C: ...etc. etc. etc.
9.    C: .
10.   S: 250 OK
11.   C: QUIT
12.   S: 221 foo.com Service closing transmission channel

Lines 7 through 9 perform the transmission of the encoded mail message payload. This is handled by the WritePayload(message) and subsequent WriteLine(".") method calls in the code snippet above, which we will examine in more detail in the next section.
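One subtlety of this step, mandated by RFC 5321’s transparency rules rather than by this particular implementation: since a lone “.” line terminates the transfer, any payload line that itself starts with a dot must be prefixed with an extra dot by the client (the server strips it back out). In C# the rule boils down to:

// RFC 5321 transparency ("dot-stuffing"): a payload line beginning with
// '.' gets an extra '.' prefix so it isn't mistaken for end-of-data.
private static string StuffDots(string line)
{
    return line.StartsWith(".") ? "." + line : line;
}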

Encoding the Message

Preparing the message for transmission is tiresome, with lots of minor details to consider. It can be broken down into two parts: the header and the content. Let’s start with the header. The code below was extracted from the MailPayload class and is responsible for generating the encoded message headers for transmission:


WriteHeader(new MailHeader(HeaderName.From, GetMessage().From));

if (GetMessage().To.Count > 0)
    WriteHeader(new MailHeader(HeaderName.To, GetMessage().To));

if (GetMessage().CC.Count > 0)
    WriteHeader(new MailHeader(HeaderName.Cc, GetMessage().CC));

WriteHeader(new MailHeader(HeaderName.Subject, GetMessage().Subject));

WriteHeader(new MailHeader(HeaderName.MimeVersion, "1.0"));

if (IsMultipart())
{
    WriteHeader(HeaderName.ContentType, "multipart/mixed; boundary=\"" + GetMainBoundary() + "\"");
    WriteLine();

    WriteMainBoundary();
}

These are pretty much straightforward:

  • From: Specifies the author(s) of the message
  • To: Contains the address(es) of the primary recipient(s) of the message
  • Cc: Contains the addresses of others who are to receive the message, though the content of the message may not be directed at them
  • Subject: Contains a short string identifying the topic of the message
  • MimeVersion: An indicator that this message is formatted according to the MIME standard, and an indication of which version of MIME is used
  • ContentType: Format of content (character set, etc.)

If the message is multipart, i.e., contains any alternate view or attachment, then there’s a need to define a content boundary for transmitting all content parts, which can have different content types and encodings.
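To illustrate, an abbreviated (hypothetical) multipart payload interleaves the differently encoded parts between boundary markers, closing with a final boundary terminated by two dashes:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="main-boundary"

--main-boundary
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

<div>Hello,</div>...

--main-boundary
Content-Type: image/png; name=pretty-image.png
Content-Transfer-Encoding: base64

iVBORw0KGgoAAAANSUhEUg...

--main-boundary--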

Notice that I’m not using the Bcc header, as RFC 822 leaves it open to the system implementer:

Some systems may choose to include the text of the “Bcc” field only in the author(s)’s copy, while others may also include it in the text sent to all those indicated in the “Bcc” list

There are dozens of other headers available providing more elaborate functionality. The initial IANA registration for permanent mail and MIME message header fields can be found in RFC4021, and is overwhelming. You can see I only covered the most basic headers in this experiment.

Now let’s look at how the message content is being handled:


private void WriteBody()
{
    var contentType = (GetMessage().IsBodyHtml ? "text/html" : "text/plain");
    WriteHeader(HeaderName.ContentType, contentType + "; charset=utf-8");
    WriteHeader(HeaderName.ContentTransferEncoding, "quoted-printable");
    WriteLine();
    WriteLine(QuotedPrintable.Encode(GetMessage().Body));
    WriteLine();
}

The body content can be sent as plain text or as HTML, and an additional ContentType header indicates which one is used. There’s also a ContentTransferEncoding header which, in this case, always adopts quoted-printable, a binary-to-text encoding scheme that uses printable ASCII characters to transmit 8-bit data over a 7-bit data path. It also limits line length to 76 characters for legacy reasons (RFC 2822 line length limits):

There are two limits that this standard places on the number of characters in a line. Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF
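As a quick illustration using the component’s own QuotedPrintable helper (the input and printed output are my own example, assuming the helper emits UTF-8 escape sequences), a non-ASCII character is expanded into =XX escapes for each of its UTF-8 bytes:

// "é" is the UTF-8 byte pair 0xC3 0xA9, so it becomes =C3=A9 on the wire,
// leaving only printable ASCII characters in the transmitted body.
var encoded = QuotedPrintable.Encode("Hello café");
Console.WriteLine(encoded); // Hello caf=C3=A9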

An attachment’s binary data is encoded to Base64, yet another binary-to-text encoding scheme, before being written to the underlying SMTP channel:


private void WriteAttachment(Attachment attachment)
{
    WriteHeader(HeaderName.ContentType, attachment.ContentType.ToString());
    WriteHeader(HeaderName.ContentDisposition, GetContentDisposition(attachment));

    if (!string.IsNullOrWhiteSpace(attachment.ContentId))
        WriteHeader(HeaderName.ContentID, "<" + attachment.ContentId + ">");

    WriteHeader(HeaderName.ContentTransferEncoding, "base64");
    WriteLine();

    WriteBase64(attachment.ContentStream);
    WriteLine();
}

Even though Base64 encoding adds about 37% to the original file size, it is still widely adopted today: every 3 input bytes become 4 output characters (a 33% increase), and the CRLF inserted every 76 characters pushes the overhead to roughly 37%. It may seem strange and wasteful but, for historical reasons, it became the standard for transmitting e-mail attachments, and it would require a huge effort to change this.

The code for writing headers and content related to alternative views was omitted for simplicity. If you’re curious you can find it in the MailPayload class source code on GitHub.

Securing the Message

Suppose the message was encoded and transmitted as described above; how can we trust the destination mail server not to tamper with the message’s headers and contents? The answer is: we can’t! So far there’s nothing preventing it from changing the message before delivering it to the end user.

Fortunately, to address this issue we can employ the DomainKeys Identified Mail (DKIM) authentication method:

DKIM allows the receiver to check that an email claimed to have come from a specific domain was indeed authorized by the owner of that domain. It achieves this by affixing a digital signature, linked to a domain name, to each outgoing email message. The recipient system can verify this by looking up the sender’s public key published in the DNS. A valid signature also guarantees that some parts of the email (possibly including attachments) have not been modified since the signature was affixed.

I’ve integrated a DKIM Signer component to the Mail Broker for signing encoded messages before sending them. To use it we need to provide a DkimConfig object to the Mail Broker constructor:


public class DkimConfig
{
    public string Domain { get; set; }

    public string Selector { get; set; }

    public RSAParameters PrivateKey { get; set; }
}

The domain and selector parameters are used by receiving SMTP servers to verify the DKIM signature header: the receiver performs a DNS lookup on <selector>._domainkey.<domain> to retrieve a TXT record containing the signature’s public key information, for instance:

"v=DKIM1; k=rsa; t=s; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDDmzRmJRQxLEuyYiyMg4suA2Sy
MwR5MGHpP9diNT1hRiwUd/mZp1ro7kIDTKS8ttkI6z6eTRW9e9dDOxzSxNuXmume60Cjbu08gOyhPG3
GfWdg7QkdN6kR4V75MFlw624VY35DaXBvnlTJTgRg/EW72O1DiYVThkyCgpSYS8nmEQIDAQAB"

This configuration object is passed to the MailPayload class for writing signed content:


private void WriteContent()
{
    if (config != null)
        WriteSignedContent();
    else
        WriteUnsignedContent();
}

private void WriteSignedContent()
{
    var content = new MailPayload(GetMessage());

    var dkim = new DkimSigner(config);
    var header = dkim.CreateHeader(content.Headers, content.Body, signatureDate);

    WriteHeader(header);
    Write(content);
}

As you can see, WriteSignedContent instantiates another MailPayload without passing the DkimConfig constructor parameter, which results in the mail message being encoded by the WriteUnsignedContent method. Once headers and body are encoded, it instantiates a DkimSigner and creates a signature header that precedes the content in the mail data transmission.
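Conceptually, what the signer produces boils down to two values: the bh= tag (a hash of the canonicalized body) and the b= tag (an RSA signature over the canonicalized headers plus the DKIM-Signature header itself with an empty b= tag). A simplified sketch of that math, with canonicalization, tag assembly and header folding omitted and placeholder inputs:

using System;
using System.Security.Cryptography;
using System.Text;

// bh= tag: base64 of the SHA-256 hash of the canonicalized body.
static string ComputeBodyHash(string canonicalBody)
{
    using (var sha = SHA256.Create())
        return Convert.ToBase64String(
            sha.ComputeHash(Encoding.ASCII.GetBytes(canonicalBody)));
}

// b= tag: base64 of an RSA-SHA256 signature over the canonicalized
// headers, signed with the domain's private key (DkimConfig.PrivateKey).
static string ComputeSignature(string canonicalHeaders, RSAParameters privateKey)
{
    using (var rsa = new RSACryptoServiceProvider())
    {
        rsa.ImportParameters(privateKey);
        return Convert.ToBase64String(
            rsa.SignData(Encoding.ASCII.GetBytes(canonicalHeaders), "SHA256"));
    }
}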

Making this DKIM signing component work was the most difficult task in this project. You can take a closer look at how it works in the MailPayload unit tests here.

Authenticating Senders

At the beginning of this post I wrote that the Mail Broker can send messages from any source e-mail address, just like Mailchimp can send messages on your behalf without you ever giving it your e-mail password. How’s that possible, you may ask? The answer is that e-mail delivery is a reputation based system.

But keep in mind that sending the message doesn’t mean it will be accepted, especially if the sender doesn’t have a good reputation. Here’s what happens when I try to send a message from no-reply@thomasvilhena.com to a Gmail mailbox from a rogue server (IP obfuscated):

421-4.7.0 [XX.XXX.XXX.XXX XX] Our system has detected that this message is
421-4.7.0 suspicious due to the very low reputation of the sending IP address.
421-4.7.0 To protect our users from Spam, mail sent from your IP address has
421-4.7.0 been temporarily rate limited. Please visit
421 4.7.0 https://support.google.com/mail/answer/188131 for more information. e6si8481552qkg.297 - gsmtp

You see, Gmail found the message suspicious, and as I checked, it didn’t even reach the destination mailbox’s spam folder; the sender’s reputation was so low the message wasn’t even spam worthy!

Here’s what Gmail suggests to prevent messages from being marked as Spam, or not being delivered to the end user at all:

  • Verify the sending server PTR record. The sending IP address must match the IP address of the hostname specified in the Pointer (PTR) record.
  • Publish an SPF record for your domain. SPF prevents Spammers from sending unauthorized messages that appear to be from your domain.
  • Turn on DKIM signing for your messages. Receiving servers use DKIM to verify that the domain owner actually sent the message. Important: Gmail requires a DKIM key of 1024 bits or longer.
  • Publish a DMARC record for your domain. DMARC helps senders protect their domain against email spoofing.

Besides the DKIM authentication method already discussed, there are three more DNS-based authentication methods specifically designed for IP address and domain validation on the receiving end, which increase the likelihood that the destination SMTP server trusts you are who you say you are, and not a spammer / scammer.

These authentication methods, along with strictly following mailing best practices (ex: sticking to a consistent send schedule / frequency, implementing mailing list opt-in / unsubscribe, constantly checking blacklists, carefully building messages, etc), will help, bit by bit, to build a good sender reputation and have e-mail messages accepted and delivered to destination users’ inboxes.

Delivery Test

This article wouldn’t be complete without a successful delivery test, so I followed the steps presented in the previous sections and:

  1. Created an EC2 instance running on AWS
  2. Created an SPF record for that instance’s public IP address (a sample record is shown below)
  3. Generated a private RSA key for signing messages
  4. Turned on DKIM by creating a DNS record publishing the public key
  5. Ran the Mail Broker from the EC2 instance sending a “Hello World” message
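For reference, an SPF record is just a TXT record published on the sending domain’s DNS zone. A minimal one authorizing a single sending IP address (hypothetical values shown) looks like this:

"v=spf1 ip4:203.0.113.10 -all"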

This time the response from Gmail server was much more friendly:

250 2.0.0 OK 1580057800 f15si8371297qtg.4 - gsmtp

Nevertheless, since my domain has no mailing reputation yet, Gmail directed the message to the Spam folder:

Gmail Spam

Here are the delivery details:

Gmail summary

Notice that standard encryption (TLS) was used for delivering the message, so the connection upgrade worked as expected.

What about SPF authentication and DKIM signature, did they work? Using Gmail’s “show original” feature allows us to inspect the received message details quite easily:

Gmail headers

Success! Both SPF and DKIM verifications passed the test ✔️


Building a mail delivery system from scratch is no simple feat. There are tons of specifications to follow, edge cases to implement, and different external SMTP servers your system will need to handle. In this experiment I only implemented the basic cases, and interacted mainly with Gmail’s server, which is outstanding, giving great feedback even in failure cases. Most SMTP servers out there won’t do that.

Mail transmission security has improved a lot over the years, but still has a long way to go to reach acceptable standards.

It was really fun running this experiment, but I must emphasize it’s not intended for production use. I guess I’ll just stick to a managed mail sending API for now 😉