In the first part of this series, I outlined the basics of how machine learning and Ruby can enhance the cybersecurity industry. In Part II, I will be diving into the important pieces that enable this enhancement.
To start, there are two types of learning algorithms that provide the foundation of machine learning:
Supervised learning trains the machine on labeled data. Every training example comes with a known classification, so there are no unknowns about any data point. For example, we can feed the machine thousands of labeled images as training data. The machine learns to associate the features of each training image with its label. When a non-training image enters the machine, it classifies the image based on its “experience” with the training images.
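As a rough illustration of the supervised case, here is a minimal nearest-neighbor sketch in plain Ruby. It assumes each image has already been reduced to a hypothetical two-number feature vector (average red, average blue); the feature choice, the labels, and the method names are all invented for this example.

```ruby
# Labeled training data: every feature vector comes with a known answer.
# Features are hypothetical: [average_red, average_blue] per image.
TRAINING = [
  { features: [210, 40], label: :sunset },
  { features: [200, 55], label: :sunset },
  { features: [30, 190], label: :ocean },
  { features: [45, 200], label: :ocean }
].freeze

# Euclidean distance between two equal-length numeric vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# 1-nearest-neighbor: return the label of the closest training example.
def classify(features, training = TRAINING)
  training.min_by { |example| distance(example[:features], features) }[:label]
end

puts classify([220, 40]) # close to the :sunset examples
puts classify([30, 200]) # close to the :ocean examples
```

The classifier never invents a category: it can only answer with a label it has already seen in the training data.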
Unsupervised learning trains the machine on unlabeled data. Unlike supervised learning, the machine must work out the classification of each image itself. For example, thousands of unlabeled images may enter the machine as training data. The machine must create its own classification criteria for this training set. The output is determined by how the machine decides to sort the images. Instead of sorting the images by title or location, the machine may decide to sort them by color, size, or even by person!
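The unsupervised case can be sketched with a naive k-means clustering loop. This is only an illustration under simplifying assumptions: the points are hypothetical (average red, average blue) feature vectors, and the centroids are seeded with the first k points, which real implementations avoid.

```ruby
# Unlabeled data: hypothetical [average_red, average_blue] vectors, no labels.
POINTS = [[210, 40], [30, 190], [220, 60], [45, 200], [200, 55], [25, 210]].freeze

# Euclidean distance between two equal-length numeric vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Component-wise mean of a non-empty list of vectors.
def mean(vectors)
  vectors.transpose.map { |column| column.sum.to_f / column.size }
end

# Naive k-means: seed the centroids with the first k points, then alternate
# between assigning each point to its nearest centroid and recomputing
# each centroid as the mean of its assigned points.
def k_means(points, k, iterations = 10)
  centroids = points.first(k)
  clusters = {}
  iterations.times do
    clusters = points.group_by do |point|
      centroids.each_index.min_by { |i| distance(centroids[i], point) }
    end
    centroids = centroids.each_index.map do |i|
      clusters.key?(i) ? mean(clusters[i]) : centroids[i]
    end
  end
  clusters.values
end

k_means(POINTS, 2).each_with_index do |cluster, i|
  puts "Group #{i}: #{cluster.inspect}"
end
```

Nothing tells the algorithm what the groups mean; it only discovers that the points fall into two clumps, and a human still has to interpret them.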
In learning these new technologies, I thought it best to compare them with the technologies we know.
First, Ruby is dynamically typed whereas Java is statically typed (both are strongly typed). Second, Ruby is an interpreted language whereas Java must be compiled before it is run. Third, instance variables in Ruby are always private to the object, whereas fields in Java can be given any access modifier.
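A small sketch of what these differences look like on the Ruby side. The Account class is hypothetical and exists only to show that an instance variable is not reachable from outside the object.

```ruby
# Dynamic typing: a variable can be rebound to a value of another type.
value = 42
value = 'forty-two'
puts value.class

# Strong typing: Ruby does not silently coerce unrelated types.
begin
  1 + '1'
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

# Instance variables are private to the object: there is no reader
# method unless the class defines one.
class Account # hypothetical class for illustration
  def initialize
    @balance = 100
  end
end

account = Account.new
puts account.respond_to?(:balance) # false
```

In Java, the first reassignment would be rejected by the compiler before the program ever runs.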
To help illustrate the differences, we will use a Turtle program written in both Ruby and Java!
The following code compares class declarations in Java and Ruby:
public class Main {
    private boolean running;
    private Turtle turtle;
    private Scanner scanner;

    public Main() {
        running = true;
        turtle = new Turtle();
        scanner = new Scanner(System.in);
    }

class Main
  # Constructor to initialize Turtle
  def initialize
    @running = true
    @turtle = Turtle.new
  end
As you can see, class declarations are quite similar. A Java constructor uses the name of the class whereas Ruby uses the initialize method name. In addition, Ruby creates instance variables on first assignment using the @<variableName> syntax, with no separate declaration. Note that every class and method definition in Ruby closes with the end keyword to mark its scope.
Again, there are similarities between how Java and Ruby declare an if/else ladder:
if (choice.equals("l")) {
    System.out.println("Turtle has moved left!");
    turtle.moveLeft();
} else if (choice.equals("r")) {
    System.out.println("Turtle has moved right!");
    turtle.moveRight();
} else if (choice.equals("u")) {
    System.out.println("Turtle has moved up!");
    turtle.moveUp();
} else if (choice.equals("d")) {
    System.out.println("Turtle has moved down!");
    turtle.moveDown();
} else if (choice.equals("q")) {
    System.out.println("Goodbye!");
    running = false;
    return;
} else {
    System.out.println("Please enter a valid command!");
}

if choice.eql?('l')
  puts 'Turtle has moved left!'
  @turtle.move_left
elsif choice.eql?('r')
  puts 'Turtle has moved right!'
  @turtle.move_right
elsif choice.eql?('u')
  puts 'Turtle has moved up!'
  @turtle.move_up
elsif choice.eql?('d')
  puts 'Turtle has moved down!'
  @turtle.move_down
elsif choice.eql?('q')
  puts 'Goodbye!'
  @running = false
  return
else
  puts 'Please enter a valid command!'
end
The differences between the two languages are minor, but still noteworthy. Java compares string values with the .equals method, whereas the Ruby version uses .eql? (the == operator is the more common Ruby idiom and gives the same result for strings). Also, notice that the “else if” condition in Ruby is spelled elsif (without the second ‘e’). Ruby shortens the System.out.println console output to the puts method.
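For comparison, here is a short sketch of Ruby's three equality checks, plus a case/when version of the same ladder. The describe_command method is a hypothetical helper written for this example, not part of the original Turtle program.

```ruby
a = 'l'
b = a.dup # a separate String object with the same contents

puts a == b       # true: same value
puts a.eql?(b)    # true: same value and same class
puts a.equal?(b)  # false: not the same object

puts 1 == 1.0     # true: == allows numeric conversion
puts 1.eql?(1.0)  # false: eql? also requires the same class

# Hypothetical helper: the if/elsif ladder expressed with case/when.
def describe_command(choice)
  case choice
  when 'l' then 'Turtle has moved left!'
  when 'r' then 'Turtle has moved right!'
  when 'u' then 'Turtle has moved up!'
  when 'd' then 'Turtle has moved down!'
  when 'q' then 'Goodbye!'
  else 'Please enter a valid command!'
  end
end

puts describe_command('u')
puts describe_command('x')
```

Which form to use is largely a matter of taste; case/when tends to read better once a ladder grows past three or four branches.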
It is mandatory in Java to have a main method to start the program. Ruby has no such requirement: a script simply runs from top to bottom, so any class may act as the starting point for the application. For simplicity’s sake, we name the entry-point class, well, Main.
public static void main(String[] args) {
    Main main = new Main();
    main.run();
}

# Calls the main program class
m = Main.new
m.run
In Ruby, we can create a new instance of a class and then call the method that will run the program (run in this case).
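A common Ruby convention, not used in the original program, is to guard the entry point so the program starts only when the file is executed directly and not when it is loaded by another file. The Main class below is a stand-in with a trivial run method.

```ruby
# Stand-in for the real Main class; run would normally hold the program loop.
class Main
  def run
    'running'
  end
end

# __FILE__ is the current file and $PROGRAM_NAME is the script that was
# launched, so this block is skipped when the file is loaded via require.
if __FILE__ == $PROGRAM_NAME
  puts Main.new.run
end
```

This is the closest Ruby gets to Java's main method: a convention, not a language requirement.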
Both languages create and handle objects very similarly.
public class Turtle {
    // Instance variables
    private int x;
    private int y;
    private final String name;

    // Constructor
    public Turtle() {
        this.x = this.y = 0;
        this.name = "Turtle";
    }

    // Moves the Turtle left
    public void moveLeft() {
        this.x -= 1;
    }

    // Moves the Turtle right
    public void moveRight() {
        this.x += 1;
    }

    // Moves the Turtle up
    public void moveUp() {
        this.y += 1;
    }

    // Moves the Turtle down
    public void moveDown() {
        this.y -= 1;
    }
}

class Turtle
  # Instance Variables: Auto-generates Getters/Setters
  attr_accessor :x, :y, :name

  # Constructor
  def initialize
    self.x = self.y = 0
    self.name = 'Turtle'
  end

  # Moves the Turtle left
  def move_left
    self.x -= 1
  end

  # Moves the Turtle right
  def move_right
    self.x += 1
  end

  # Moves the Turtle up
  def move_up
    self.y += 1
  end

  # Moves the Turtle down
  def move_down
    self.y -= 1
  end
end
To conclude the comparisons, we will take a look at the getters and setters in both languages.
public int getX() {
    return x;
}

public void setX(int x) {
    this.x = x;
}

public int getY() {
    return y;
}

public void setY(int y) {
    this.y = y;
}

public String getName() {
    return name;
}

# Instance Variables: Auto-generates Getters/Setters
attr_accessor :x, :y, :name
As the comment suggests, attr_accessor is a method that takes symbols (the names with the ‘:’ prefix) and auto-generates a getter and a setter for each corresponding instance variable.
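One detail the two versions do not share: the Java class gives name a getter only, because the field is final, while attr_accessor also generates a setter for it. A closer Ruby match, sketched here as a variation on the Turtle class above, would use attr_reader for name.

```ruby
class Turtle
  attr_accessor :x, :y # reader and writer, like getX/setX and getY/setY
  attr_reader :name    # reader only, like getName with no setter

  def initialize
    @x = @y = 0
    @name = 'Turtle'
  end
end

turtle = Turtle.new
turtle.x = 5
puts turtle.x
puts turtle.name
puts turtle.respond_to?(:name=) # false: no setter was generated
```

There is also attr_writer for the rarer case where a value should be settable but not readable from outside.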
An important aspect of cybersecurity is record keeping. When an incident occurs, there will need to be an investigation. Records provide the evidence that helps resolve these investigations.
There are two main types of records, or logs, that we will cover here: security logs and audit logs.
Security logs determine what happened whereas audit logs determine how something happened. At first glance, there does not seem to be much of a difference between the two. However, the examples below will clarify the differences.
Figure 1. Security Log [1].
Notice that each column is focused on providing exact details on what event occurred. Contrast this with the audit log example below.
Figure 2. Audit Logs [2].
Notice that each entry records a single action that a user has performed. Together, the entries provide a “trail of breadcrumbs” for an analyst to determine how something occurred.
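To make the distinction concrete, here is a small sketch that models both kinds of record in Ruby. The field names, sample values, and the trail_for helper are all hypothetical; real products define their own log schemas.

```ruby
# Hypothetical record shapes, invented for illustration.
SecurityEvent = Struct.new(:time, :source_ip, :event, :severity, keyword_init: true)
AuditEntry = Struct.new(:time, :user, :action, keyword_init: true)

# A security log entry answers "what happened?"
security_log = [
  SecurityEvent.new(time: Time.utc(2024, 1, 5, 10, 2), source_ip: '203.0.113.7',
                    event: 'port scan blocked', severity: :high)
]

# Audit log entries answer "how did it happen?", one user action at a time.
audit_log = [
  AuditEntry.new(time: Time.utc(2024, 1, 5, 10, 5), user: 'alice', action: 'changed firewall rule'),
  AuditEntry.new(time: Time.utc(2024, 1, 5, 10, 1), user: 'alice', action: 'logged in'),
  AuditEntry.new(time: Time.utc(2024, 1, 5, 10, 3), user: 'bob', action: 'logged in')
]

# The "trail of breadcrumbs": one user's actions in chronological order.
def trail_for(user, audit_log)
  audit_log.select { |entry| entry.user == user }.sort_by(&:time).map(&:action)
end

puts security_log.first.event
puts trail_for('alice', audit_log).inspect
```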
Let’s walk through a use case to see how one can use the logs in an investigation.
A user within your company has been logging in from various locations around the world. You hypothesize that they are using a VPN. However, you suspect that something more malicious is going on.
You decide to sift through the logs to test your hypothesis. What would you look for?
Login/logout times? IP ownership? Would you examine every log entry?
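One of these questions, login times versus location, can be sketched as a simple rule before any machine learning is involved. The sketch below assumes a hypothetical Login record with user, time, and country fields, and flags consecutive logins by the same user from different countries within a time window. The one-hour default is an arbitrary choice for illustration.

```ruby
# Hypothetical login record; the field names are assumptions.
Login = Struct.new(:user, :time, :country, keyword_init: true)

# Returns pairs of consecutive logins by the same user that come from
# different countries less than `window` seconds apart.
def suspicious_pairs(logins, window: 3600)
  logins.group_by(&:user).flat_map do |_user, entries|
    entries.sort_by(&:time).each_cons(2).select do |earlier, later|
      earlier.country != later.country && (later.time - earlier.time) < window
    end
  end
end

# Made-up sample data.
logins = [
  Login.new(user: 'alice', time: Time.utc(2024, 1, 5, 9, 0), country: 'US'),
  Login.new(user: 'alice', time: Time.utc(2024, 1, 5, 9, 20), country: 'BR'),
  Login.new(user: 'bob', time: Time.utc(2024, 1, 5, 9, 0), country: 'DE'),
  Login.new(user: 'bob', time: Time.utc(2024, 1, 5, 18, 0), country: 'FR')
]

suspicious_pairs(logins).each do |earlier, later|
  minutes = ((later.time - earlier.time) / 60).to_i
  puts "#{earlier.user}: #{earlier.country} -> #{later.country} within #{minutes} minutes"
end
```

A rule like this cannot tell a VPN from a stolen credential; it only narrows millions of entries down to the handful worth a closer look.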
One can see that investigations are a lengthy process. Fortunately, machine learning can help.
By using machine learning and a programming language like Ruby, we can develop tools that sift through millions of log entries to flag potentially malicious activity. In the final part of this series, I will demonstrate how we can develop a tool to do this.
[1] https://help.fortinet.com/fadc/4-5-1/olh/Content/FortiADC/Images/security-log.png
[2] https://help.fortinet.com/coyotepoint/10-3-2/Content/Logging/gui_audit_log.gif