Increasing The Capacity of Headstega Based on Bitwise Operation

Headstega (Head steganography) is a noiseless steganography method that uses email headers as a cover for concealing messages. However, it has a low embedding capacity and it raises suspicion. To overcome these problems, a bitwise operation is proposed. In the proposed method, the message was embedded into the cover by converting the message and the cover into binary representations based on a mapping table that was already known by the sender and the receiver. XOR bitwise operations were then applied to the secret message bits and the cover bits, starting at a location given by a random number that was generated using a modular function. The result was converted into characters that represent the secret message bits. After embedding the message into the cover, an email alias was generated to camouflage the secret message characters. Finally, the sender sent the embedded cover and the email alias to the recipient. Using the proposed method, the embedding capacity is 89% larger than that of the original Headstega. To reduce the adversary's suspicion, an existing email address was used instead of creating a new email address.


Introduction
Recently, the use of communication and information exchange has been increasing. Since important information should be secured during information exchange, security against cheaters or eavesdroppers is necessary [1]. One method for securing information is steganography [2]. There are several types of steganography based on the covers used for camouflaging secret messages, such as audio steganography [3], text steganography [4,5,6], image steganography [7,8], and video steganography [9,10]. Steganography can be classified as noisy [7,9] or noiseless [11,12]. This research focuses on noiseless steganography. One noiseless steganography method is Headstega, proposed by Desoky. Headstega camouflages data in an email header [12].
The drawbacks of Headstega are its low embedding capacity and the suspicion it raises, since a new email address is necessary for embedding a message. The capacity is low because the secret message is embedded only into the first alphabetical character of the email address and into the numerical characters of the email address. In other words, 4 bits of the secret message are embedded into the first alphabetical character of the email address while the next 7 bits are embedded in the numerical characters, such that this method conceals 4 to 11 bits in each email address.
To overcome these drawbacks, a bitwise operation for embedding the message into the cover was proposed. In this case, the cover was an existing email address. The basic idea of the method was to embed a message into the cover based on a mapping table. The embedding process was done by converting the message and the cover into binary representations based on a mapping table that was already known by the sender and the recipient. Bitwise operations were then performed on these bits, starting at a location given by a random number that was generated using a modular function. After embedding the message into the cover, the binary numbers that represent the secret message were converted into characters, and an email alias was generated from these characters to camouflage them. Finally, the sender sent the embedded cover and the email alias to the recipient.
The experiments were performed using 30 different covers and 30 messages of various lengths. The results of the experiment showed that the proposed method has a higher embedding capacity than the previous one. The proposed method can embed messages 89% larger than the previous method. Moreover, in terms of the number of covers used, the previous method required 95% more covers than the proposed method. To reduce the adversary's suspicion, an existing email address is used. Thus, a new email address generation process is not necessary.

Related Work
Headstega is a method in the area of noiseless steganography (Nostega) that camouflages messages in a part of the email header such as the email address, subject, cc, etc. [11,12]. Headstega consists of two main methods, encoding and decoding. The encoding method consists of message encoding and message camouflaging. The architecture of Headstega is shown in Fig. 1.
Fig. 1. The architecture of Headstega [12]
Message encoding is a method for encoding a message into an appropriate form based on the encoding parameters and a predefined steganographic coding map. The secret message is converted into binary and grouped into slices of a certain length. Each slice is then converted into a letter based on the coding map. The encoded message is camouflaged by generating the stego cover (email header) that conceals it. For example, according to the coding map in Table 1 [12], the four-bit slice "0111" is represented by the letter "H" or "X". Thus, the first character of the email address that is generated for embedding 0111 is the letter H or X.
Message decoding is used for extracting the encoded message from the email addresses. For example, suppose the receiver receives an email addressed to a group of receivers whose email addresses are "Hasmawati@test.xyz" and "Erisdian@test.xyz". The secret message is extracted from these email addresses based on the same coding map used in the message encoding method. Since the first character of the first email address is the letter "H", "0111" is extracted. For the second email address, "Erisdian@test.xyz", the first letter is "E", so the binary number extracted based on the coding map is "0100". Concatenating these binary codes gives "01110100". Since this binary representation is equal to 116 in ASCII code, it is translated into the original message 't'.
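The decoding example above can be traced with a minimal sketch. It assumes a partial coding map holding only the three entries quoted in the text (H, X and E); the rest of Table 1 is not reproduced here, and the names `CODING_MAP` and `decode_headstega` are hypothetical.

```python
# Partial coding map: only the entries quoted in the text are known here.
CODING_MAP = {"H": "0111", "X": "0111", "E": "0100"}


def decode_headstega(addresses):
    """Concatenate the 4-bit code of each address's first letter, then read 8-bit ASCII."""
    bits = "".join(CODING_MAP[address[0].upper()] for address in addresses)
    usable = len(bits) - len(bits) % 8  # ignore an incomplete trailing byte
    chars = [chr(int(bits[i:i + 8], 2)) for i in range(0, usable, 8)]
    return bits, "".join(chars)


bits, message = decode_headstega(["Hasmawati@test.xyz", "Erisdian@test.xyz"])
print(bits, message)  # 01110100 t
```

Each address contributes only four bits here, which illustrates why two addresses are needed to carry a single character.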

Proposed Method
To increase the capacity of Desoky's method [12], as discussed in section 1, a bitwise operation was introduced in the proposed method. The architecture of the proposed method is shown in Fig. 2. The main difference between Desoky's method and the proposed method is that cover generation was not necessary in the proposed method because an existing email address can be used as the cover.
Suppose Alice sent a secret message to Bob by email and embedded the secret message into the email address. To embed the secret message, Alice needed a code mapping table such that Bob could extract the message using the same code mapping table. Therefore, the code mapping table should be agreed upon in advance by Bob and Alice. The embedded message was represented as a sequence of characters. To extract the message, Bob used the code mapping table and these characters to obtain the secret message. The proposed method consisted of two main processes, encoding and decoding. The sender conducted message encoding by embedding the secret message into a cover (an existing email address). The receiver conducted message decoding by extracting the secret message from the cover.

1. Message Encoding Process
The basic concept of message encoding was to convert each letter of the secret message into a five-bit binary code based on the mapping table (Table 2) that was already known by both parties. All five-bit binary codes were concatenated to obtain the message bit stream. The email addresses were converted into five-bit binary codes in the same way, and the codes of all email addresses were concatenated to obtain the cover bit stream.
The next process was XOR-ing the secret message bit stream and the cover bit stream. To increase the security, the bit location where the XOR process starts should be randomized, such that it was not easy for the attacker to find the location of the embedded secret message. To indicate the location of the embedded secret message, random number generation was necessary. The random number was calculated using a modulo function of the number of characters used in the mapping table, the time when the email was sent, the length of the secret message, and the length of the cover.
Finally, the result of the XOR process was encoded into characters based on Table 2, and an email alias was generated to camouflage these characters. The overview of message encoding is depicted in Fig. 3.

2. Converting the Secret Message and the Cover into Five Bits String
The objective of this algorithm was to convert the secret message and the email address into binary numbers. The conversion process consists of the following steps.
1. Each character of the email address and the secret message was converted into a five-bit binary code based on Table 2.
2. All five-bit binary codes of the secret message were concatenated to represent the secret message in binary form. A similar process was applied to the cover.
The algorithm for converting the secret message and the cover into five-bit binary codes is depicted in Fig. 4. The character to five-bit string mapping table shown in Table 2 consisted of 26 alphabetic characters and some special characters. These special characters were selected based on their occurrence frequency in email addresses and in words in the Indonesian language. Generally, email addresses consist of a sequence of alphabetic characters and some special characters such as commas, periods, spaces, @ signs, underscores, and question marks (?), as well as words in the Indonesian language. The output of this algorithm was the binary form of the secret message and the cover.
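The conversion steps can be sketched as follows. Table 2 is not reproduced in this text, so the code uses a hypothetical stand-in: the 26 letters followed by the six special characters named above, numbered in that order. The actual assignment of codes in Table 2 may differ, and the function names are illustrative only.

```python
import string

# Hypothetical stand-in for Table 2: 26 letters plus six special characters.
# 32 symbols fit exactly into five bits; the real code assignment may differ.
ALPHABET = string.ascii_lowercase + ",. @_?"
CHAR_TO_BITS = {ch: format(i, "05b") for i, ch in enumerate(ALPHABET)}
BITS_TO_CHAR = {code: ch for ch, code in CHAR_TO_BITS.items()}


def to_bit_stream(text):
    """Map each character to its five-bit code and concatenate the codes."""
    return "".join(CHAR_TO_BITS[ch] for ch in text.lower())


def from_bit_stream(bits):
    """Slice a bit stream into groups of five bits and map each back to a character."""
    if len(bits) % 5 != 0:
        raise ValueError("bit stream length must be a multiple of five")
    return "".join(BITS_TO_CHAR[bits[i:i + 5]] for i in range(0, len(bits), 5))


cover_bits = to_bit_stream("hasmawati@telkomuniversity.ac.id")
print(len(cover_bits))  # 160 bits for the 32-character cover
```

The inverse function is included only to show that the mapping is lossless; it corresponds to the slicing step used later when bits are turned back into characters.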

3. Embedding Secret Message
The objective of this algorithm was to embed the secret message using the XOR operation. This process consists of the following steps:
1. Counting the lengths of the secret message bit stream and the cover bit stream.
2. Determining the beginning location of the XOR process using equation (1), where $s$ is the starting location of the XOR process, $n$ is the number of characters used in the mapping table, $t$ is the time when the email was sent, $l_c$ is the length of the cover bit stream, and $l_m$ is the length of the secret message bit stream. If $s + l_m > l_c$, then the XOR process for the remainder was started from the first character of the cover.
3. XOR-ing the secret message bits with the cover bits starting from the location obtained in step 2.
The output of this algorithm is the embedded secret message in binary form. The algorithm for embedding the secret message is shown in Fig. 5.
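A minimal sketch of the XOR step with wrap-around is given below. The starting location is passed in as a parameter because equation (1) is not reproduced in this text; the function name `xor_embed` and the rejection of messages longer than the cover are assumptions of this sketch, not part of the published algorithm.

```python
def xor_embed(message_bits, cover_bits, start):
    """XOR message bits with cover bits from `start`, wrapping to the cover's first bit."""
    if len(message_bits) > len(cover_bits):
        # Assumption of this sketch: the message must fit into one pass over the cover.
        raise ValueError("message bit stream is longer than the cover bit stream")
    result = []
    for offset, message_bit in enumerate(message_bits):
        cover_bit = cover_bits[(start + offset) % len(cover_bits)]
        result.append("1" if message_bit != cover_bit else "0")
    return "".join(result)


cover = "1100101011"
message = "10110"
embedded = xor_embed(message, cover, start=8)
print(embedded)  # 01000: cover bits 8, 9, 0, 1, 2 ("11110") XOR "10110"
```

The modulo on the cover index implements the rule that the remainder of the message continues from the beginning of the cover when the start location plus the message length exceeds the cover length.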

4. Converting the Binary Form of Embedded Secret Message into Character
The objective of this algorithm was to convert the binary form of the embedded secret message into the alias seed by converting it into characters based on Table 2. The process consists of the following steps:
1. Slicing the binary form of the embedded secret message into groups of five bits.
2. Converting each five-bit string into a character based on Table 2. This character is an alias seed character.
The output of this algorithm was the alias seed characters. The algorithm for converting the binary form of the embedded secret message into characters is shown in Fig. 6.

5. Generating Email Alias
The objective of this algorithm was to generate an email alias representing the alias seed characters to be sent to the recipient. The process for generating the alias consists of the following steps.
1. Determining the rule for generating the email alias using equation (2), where $r$ is the result of the modular function (the number for choosing the rule used to construct the email alias), $n$ is the number of characters used in the mapping table, $t$ is the time when the email was sent, and $l_c$ is the length of the cover bit stream.
2. If the result of equation (2) is even, then generate the email alias based on rule A, as follows:
1) If the first letter of the alias seed characters is a vowel, add one consonant after the vowel.
2) If the first letter of the alias seed characters is a consonant, add one vowel after the consonant.
3) For the second until the last character of the alias seed characters, add one vowel before a consonant or one consonant before a vowel.
4) For special characters, an additional character is not necessary.
3. If the result of equation (2) is odd, then generate the email alias based on rule B, as follows:
1) If the first letter of the alias seed characters is a vowel, add one consonant before the vowel.
2) If the first letter of the alias seed characters is a consonant, add one vowel before the consonant.
3) For the second until the last character, add one vowel after a consonant or one consonant after a vowel.
4) For special characters, an additional character is not necessary.
The output of this algorithm was the email alias that will be sent to the recipient. The algorithm for generating the email alias is shown in Fig. 7.
For example, when the result of the modular function using equation (2) is even, the generating process is performed using rule A; for the alias seed characters bp@mwqx, the result is boip@amiwaqux.
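Rules A and B can be sketched as below. The text does not say how the filler vowel or consonant is chosen, so this sketch picks it at random; the rule is passed in directly because equation (2) is not reproduced here. The names `generate_alias`, `VOWELS` and `CONSONANTS` are hypothetical.

```python
import random

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def generate_alias(seed, rule, rng=random):
    """Pad every letter of the alias seed with one filler letter of the opposite kind."""
    parts = []
    for index, ch in enumerate(seed):
        if not ch.isalpha():
            parts.append(ch)  # special characters are copied without a filler
            continue
        # Assumption: the filler is any random letter of the opposite kind.
        filler = rng.choice(CONSONANTS if ch.lower() in VOWELS else VOWELS)
        # Rule A: filler after the first seed character, before all later ones.
        # Rule B: filler before the first seed character, after all later ones.
        filler_after = (index == 0) if rule == "A" else (index != 0)
        parts.append(ch + filler if filler_after else filler + ch)
    return "".join(parts)


alias = generate_alias("bp@mwqx", rule="A", rng=random.Random(2021))
print(alias)  # 13 characters; the filler letters depend on the random seed
```

With rule A the output has the same shape as the alias boip@amiwaqux quoted in the text: the seed letters sit at positions 0, 3, 6, 8, 10 and 12, and the special character is left unpadded.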

7. Message Decoding Process
The basic concept of message decoding was to extract the secret message from the stego email header. The message decoding process started by extracting the alias seed characters from the received email alias. All alias seed characters were then converted into five-bit binary codes to obtain the alias seed character bit stream. Similarly, the email address was converted into five-bit binary codes to obtain the cover bit stream. The next process was extracting the binary string of the secret message by performing the XOR operation between the alias seed character bit stream and the cover bit stream. Before the XOR-ing process, random number generation was conducted to indicate the beginning location of the XOR-ing process. The result of the XOR-ing process was sliced into five-bit binary codes, which were then converted into characters. The overview of the message decoding process is shown in Fig. 8.

8. Extracting the Alias Seed Characters
The objective of this algorithm was to extract the alias seed characters from the email alias received by the receiver. The extracting process consists of the following steps.
1. Determining the rule for extracting the alias seed characters using equation (2).
3) Repeating step 2) until the last character of the email alias.
The algorithm for extracting the alias seed characters is shown in Fig. 9.
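Extraction can be sketched as the inverse of the alias generation rules: every letter of the alias seed occupies a two-letter pair in the alias, and the rule decides which letter of each pair is the filler. This is a reading of rules A and B rather than a transcription of the algorithm in Fig. 9, and `extract_alias_seed` is a hypothetical name.

```python
def extract_alias_seed(alias, rule):
    """Drop the filler letter from each two-letter pair; copy special characters as-is."""
    seed = []
    i = 0
    while i < len(alias):
        ch = alias[i]
        if not ch.isalpha():
            seed.append(ch)  # special characters carry no filler
            i += 1
            continue
        pair = alias[i:i + 2]
        if len(pair) < 2 or not pair[1].isalpha():
            raise ValueError("alias does not follow the padding rules")
        is_first_seed_char = len(seed) == 0
        # Rule A: seed letter comes first only in the first pair.
        # Rule B: seed letter comes second only in the first pair.
        seed_first = is_first_seed_char if rule == "A" else not is_first_seed_char
        seed.append(pair[0] if seed_first else pair[1])
        i += 2
    return "".join(seed)


print(extract_alias_seed("boip@amiwaqux", "A"))  # bp@mwqx
```

The printed value matches the alias seed characters that the text reports for the alias boip@amiwaqux under rule A.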

9. Converting the Alias Seed Characters and the Cover into Five Bits String
The objective of this algorithm was to convert the alias seed characters and the cover into five-bit binary codes. The conversion process consists of the following steps.
1. Each character of the alias seed is converted into a five-bit binary code based on Table 2.
2. All five-bit binary codes of the alias seed characters are concatenated to represent the alias seed character bit stream.
3. Steps 1 and 2 are also applied to the cover.
The output of this algorithm was the cover bit stream and the alias seed character bit stream. The algorithm for converting the alias seed characters and the cover into five-bit binary codes is shown in Fig. 10.

10. Extracting the Secret Message
The objective of this algorithm was to extract the secret message using the XOR operation. The process for extracting the secret message consists of the following steps.
1. Counting the lengths of the embedded secret message bit stream and the cover bit stream.
2. Determining the beginning location of the XOR process using equation (3), where $s$ is the starting location of the XOR process, $n$ is the number of characters in the mapping table, $t$ is the time when the email was sent, $l_c$ is the length of the cover bit stream, and $l_a$ is the length of the alias seed character bit stream. If $s + l_a > l_c$, then the XOR process for the remainder is started from the first character of the cover.
3. XOR-ing the embedded secret message bits with the cover bits starting from the location obtained in step 2.
The output of this algorithm was the secret message bit stream. The algorithm for extracting the secret message is shown in Fig. 11.
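The reason the same XOR step recovers the message can be written out bit by bit. The notation is chosen here for illustration: $m_i$ is the $i$-th secret message bit, $e_i$ the $i$-th embedded bit, $c_j$ the $j$-th cover bit, $s$ the starting location and $l_c$ the length of the cover bit stream. The derivation assumes that the sender and the receiver obtain the same starting location.

```latex
\begin{aligned}
e_i &= m_i \oplus c_{(s+i) \bmod l_c} && \text{(embedding)} \\
e_i \oplus c_{(s+i) \bmod l_c}
    &= m_i \oplus \bigl(c_{(s+i) \bmod l_c} \oplus c_{(s+i) \bmod l_c}\bigr) \\
    &= m_i \oplus 0 = m_i && \text{(extraction)}
\end{aligned}
```

Because XOR is its own inverse, no separate extraction operation is needed; only the cover bits and the starting location must match on both sides.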

11. Converting the Secret Message Bits Stream into Characters
The objective of this algorithm was to convert the secret message bit stream into characters. The process consists of the following steps:
1. Slicing the secret message bit stream into groups of five bits.
2. Converting each group into a character based on Table 2.
The algorithm for converting the secret message bit stream into characters is shown in Fig. 12.

12. Implementation of Message Decoding
Suppose the email alias was 'boip@amiwaqux', the time when the email was sent was 01:26, and the cover was 'hasmawati@telkomuniversity.ac.id'. To extract the secret message from the alias and the email address, the following process was conducted.
1. Extracting the alias seed characters from the email alias using Algorithm 5. Since the result of the modular function using equation 2 was even, the extracting process was performed based on rule A. The alias seed characters obtained were bp@mwqx.
2. Converting 'bp@mwqx' into five bits binary code based on

Experiment and Analysis
To evaluate the embedding capacity of the proposed method and Desoky's method, two experiments were conducted. The first experiment calculated the number of covers used for embedding the message, and the second calculated the embedding capacity of the proposed method and Desoky's method. Both experiments were conducted using 30 covers and 30 messages of various lengths.
For the experiment using Desoky's method, the secret message was divided into two sentences. The first sentence was embedded in the first letter of the email address. The second sentence was embedded in the numeric characters of the email address. The covers were email addresses taken from a database of about 5,307 valid email addresses.

Evaluating Cover's Size For Embedding Messages
The objective of this experiment was to evaluate the number of covers required for embedding a message using the previous method and the proposed method. The experiment result shows that, to embed a message of a given length, the previous method required 5% more email addresses than the proposed method. This occurred because, using Desoky's method, one email address could be embedded with a maximum of 11 bits of the secret message. This maximum assumes that the character represented by 4 bits of the message matches the first character of the email address, such that these 4 bits can be embedded into the first character, and that the numeric characters represented by 7 bits of the message match the three numeric characters of the email address, such that these 7 bits can be embedded into the numeric characters. Meanwhile, using the proposed method, one email address could be embedded with a minimum of 16 bits (by assuming that one email address could be embedded with a minimum of two characters or 16 bits). The result of the experiment for evaluating the cover's size for embedding messages is shown in Table 3.

Evaluating the Embedding Capacity
The objective of this experiment was to evaluate the number of secret message characters that can be embedded into the cover using Desoky's method and the proposed method. In this case, the cover used for both methods was the same. The experiment was conducted using 30 covers and 30 messages of various lengths. Since some covers could be embedded with secret messages of a similar length, these covers were classified into 10 classes.
The result of the experiment shows that the number of characters that can be embedded using the proposed method was larger than using Desoky's method. In general, the proposed method could embed messages 89% larger than Desoky's method.
Since the embedding capacity is the maximum number of bits that can be embedded into a cover divided by the number of bits of the cover, the maximum capacity was calculated using equation (4):
$$C = \frac{b_{max}}{b_c} \times 100\% \quad (4)$$
where $b_{max}$ is the maximum number of bits that can be embedded into the cover, and $b_c$ is the number of bits of the cover, which can be calculated using equation (5).
Using Desoky's method, one email address could be embedded with at most 11 bits of the message, so $b_{max}$ was 11 bits, while using the proposed method, one email address could be embedded with at most $b_c$ bits. Thus, the maximum embedding capacity using Desoky's method was $(11/b_c) \times 100\%$, while using the proposed method it was 100%.
Since the number of characters in one email address was larger than or equal to two, the minimum capacity using the proposed method was $(10/b_c) \times 100\%$. Meanwhile, using Desoky's method, the minimum capacity was $(4/b_c) \times 100\%$. Thus, both the minimum capacity ($10/b_c$ versus $4/b_c$) and the maximum capacity (100% versus $11/b_c$) were higher when using the proposed method. Therefore, it can be concluded that the embedding capacity of the proposed method was larger than that of Desoky's method.
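As a rough numerical illustration of the capacity ratio, the sketch below evaluates it for the 32-character example cover used elsewhere in the paper. It assumes that the number of bits of the cover is five bits per character, in line with the five-bit codes; equation (5) itself is not reproduced in this text, so that is an assumption, and the names are hypothetical.

```python
BITS_PER_CHAR = 5  # assumption: cover size in bits is 5 bits per character


def capacity_percent(embedded_bits, cover_chars):
    """Embedding capacity: embedded bits divided by cover bits, as a percentage."""
    cover_bits = cover_chars * BITS_PER_CHAR
    return 100.0 * embedded_bits / cover_bits


cover = "hasmawati@telkomuniversity.ac.id"
desoky_max = capacity_percent(11, len(cover))
proposed_max = capacity_percent(len(cover) * BITS_PER_CHAR, len(cover))
print(desoky_max, proposed_max)  # 6.875 100.0
```

Under this assumption, Desoky's 11-bit ceiling uses under 7% of this cover, whereas the proposed method can in principle use every cover bit.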
The results of the experiment for evaluating the embedding capacity are shown in Fig. 13.

Imperceptibility
Imperceptibility is the probability that an attacker detects the existence of a hidden message in the cover message. The hidden message should be invisible and should not raise the suspicion of the human visual system. In practice, imperceptibility was evaluated by analyzing the degree of imperceptibility, which was obtained by comparing the cover message before and after a message was embedded into it. The difference between those two messages was calculated using the Jaro distance $d_j$ [14,15,16] (see equation 6), where $m$ is the number of matching characters between those two messages, $l_1$ is the length of the cover message, $l_2$ is the length of the cover message after the embedding process, and $t$ is half the number of transpositions.
Since both Desoky's method and the proposed method are noiseless steganography, there was no difference between the cover message and the cover after the embedding process; in other words, $m = l_1 = l_2$ and $t = 0$ (because there are no transpositions). Thus, in this case, $d_j$ was 0.
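The zero result can be worked out explicitly under the assumption that equation (6) is the standard Jaro distance, that is, one minus the Jaro similarity; the equation itself is not reproduced in this text. The symbols $d_j$, $m$, $l_1$, $l_2$ and $t$ are notation chosen here for the distance, the number of matching characters, the two message lengths and half the number of transpositions.

```latex
\begin{aligned}
d_j &= 1 - \frac{1}{3}\left(\frac{m}{l_1} + \frac{m}{l_2} + \frac{m - t}{m}\right) \\
    &= 1 - \frac{1}{3}\left(1 + 1 + 1\right) = 0
    \qquad \text{for } m = l_1 = l_2,\; t = 0
\end{aligned}
```

A distance of zero means the cover before and after embedding is character-for-character identical, which is what a noiseless method implies.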

Robustness
During the transmission process, attackers may mount multiple attacks on the stego-text, such as manipulating the text or deleting parts of it. A robust steganography method should make it difficult for attackers to change or destroy a hidden message. This capability can be measured by the probability that a proportion of the embedded message is lost from the stego-text, $P(L)$. Suppose the number of embedded locations in the cover message is $k$ and the length of the cover message is $n$. Then $P(L) = k/n$, and the robustness is calculated from $P(L)$ following [14,15,16].
Since the proposed method can embed into every location of the cover message, using all locations would raise the losing probability to its maximum of 1. However, since a lower probability of losing the hidden message leads to a more robust steganography method, not all locations should be used. Suppose 11 bits (the maximum embedding capacity of Desoky's method) are embedded into the cover message. Using Desoky's method, 7 bits are embedded into three numeric characters and 4 bits into one alphabetical character, so $P(L) = 4/n$. Using the proposed method, $P(L) = 3/n$, since one character can carry 5 bits and three characters are therefore necessary for embedding 11 bits. Thus, the losing probability for embedding 11 bits using the proposed method was lower than when using Desoky's method. Since $P(L)_{Desoky} > P(L)_{proposed}$, the robustness of Desoky's method is lower than that of the proposed method; in other words, the proposed method was more robust than Desoky's.
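The comparison for an 11-bit message can be checked with a short calculation. The cover length of 32 characters is taken from the example cover used in the paper; the function names are hypothetical, and the loss probability is modelled simply as embedded locations divided by cover length, as described above.

```python
import math


def locations_needed(message_bits, bits_per_location):
    """Number of cover characters needed to carry the given number of bits."""
    return math.ceil(message_bits / bits_per_location)


def loss_probability(embedded_locations, cover_length):
    """Proportion of cover locations that carry embedded bits."""
    return embedded_locations / cover_length


cover_length = 32  # characters in the example cover address
desoky_locations = 3 + 1  # three numeric characters (7 bits) plus the first letter (4 bits)
proposed_locations = locations_needed(11, 5)  # five bits per character

print(loss_probability(desoky_locations, cover_length))    # 0.125
print(loss_probability(proposed_locations, cover_length))  # 0.09375
```

Fewer embedded locations for the same payload means a smaller share of the cover whose alteration would damage the message.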

Security Analysis
Since the security of Desoky's method depends only on the map used for transforming characters into binary codes, the security evaluation was conducted based on the probability of obtaining the map. Suppose there are $n$ characters that have to be mapped to $n$ codes. Assuming that each character is mapped to a unique code, there are $n!$ possible maps.
Thus, if an attacker tried to attack Desoky's method, the probability of obtaining the correct map was $1/n!$. In the case of the proposed method, guessing the alias rule was also necessary in addition to guessing the map. Since there were two possible alias types (rules A and B), the probability of obtaining the correct map and rule for the proposed method was $1/(n! \times 2)$.
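The two probabilities can be compared exactly with rational arithmetic. The value of 32 characters is an assumption based on a five-bit code space; the actual size of the mapping table is not stated in this text, and `guess_probability` is a hypothetical name.

```python
import math
from fractions import Fraction


def guess_probability(n_chars, alias_rules=1):
    """Probability of guessing one specific map (and alias rule) uniformly at random."""
    return Fraction(1, math.factorial(n_chars) * alias_rules)


n = 32  # assumption: one character per five-bit code
p_desoky = guess_probability(n)
p_proposed = guess_probability(n, alias_rules=2)
print(p_desoky / p_proposed)  # 2
```

The alias rule halves the attacker's chance of a correct guess; the dominant term in both cases is the factorial number of possible maps.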
In addition to evaluating the security based on statistics, we also evaluated the security based on the security model in [17]. Since the hidden message ($E$) was not determined by the stego-text ($S$), $H(E|S) = H(E)$ (where $H(E)$ is the uncertainty of $E$). Suppose the attacker knew $S$ and $C$ (the source of the cover) but could not obtain $E$, because the stego-text was noiseless, which means there was no difference between the stego-text and the cover. Thus, the uncertainty about $E$ given knowledge of $S$ and $C$ is $H(E|(S,C)) = H(E)$, or it can be concluded that $H(E|(S,C)) = H(E|S) = H(E)$.
The proposed method used a secret XOR starting location that depended on the number of codes in the map, which was assigned as the secret $K$. Furthermore, based on [17], the uncertainty about $E$ given knowledge of $S$ and $C$ is $H(E|(S,C)) = H(E) + H(K|E)$. The uncertainty about $K$ given knowledge of $E$ is $H(K|E) = H(K)$. Since $K$ was not determined by $E$, $H(E|(S,C)) = H(E) + H(K)$, which was greater than $H(E)$. Since $H(E|(S,C)) = H(E|S) = H(E)$ and $H(E|(S,C)) \geq H(E)$, it was proven that the proposed method was information-theoretically secure.

Time Complexity
Since our proposed method used a function based on modular exponentiation to generate the secret XOR starting location, the time complexity of our proposed method is $O(2^n)$, while that of Desoky's method is $O(n)$.

Conclusion
The weakness of Headstega [12] is that it has a low embedding capacity and it raises the adversary's suspicion. The experiment results show that, by introducing the bitwise operation process, the proposed method can improve the embedding capacity while preserving the security of Headstega. The capacity of the proposed method is 89% larger than that of the previous method. Moreover, for embedding similar messages, the previous method required 5% more email addresses than the proposed method. To reduce the adversary's suspicion, an existing email address is used instead of generating a new email address.
The imperceptibility of the proposed method is similar to that of Desoky's method, while the robustness of the proposed method is better. Furthermore, the security of the proposed method is better than that of Desoky's method.