Machine Learning and Ruby: Enhancing the Cybersecurity Industry Part I

Did you know that in the first half of 2019, hackers exposed 4.1 billion private data records to the
public [1]? The future does not seem bright either. According to Gartner, worldwide security spending is
expected to exceed $133 billion in 2022 [2]. With a severe lack of talent in the field of cybersecurity,
how are we going to meet the predicted demand [3]? Recent technological advances may help solve this
problem.

Computing power has grown steadily over time, much as Moore’s Law predicted [4]. This gives us many
opportunities for technological exploration [5]. One avenue that is revolutionizing industries
worldwide is Machine Learning.

What is Machine Learning?

Machine Learning can be defined as follows:

Using data to solve problems [6].

Without data, the machine cannot learn. Machines are like humans in this respect: we need to experience
the world to learn how to live in it. For example, we learn how to walk by trying to walk. We try one
way, and if it does not work, we try another. If that way works, we remember it and try to repeat it.
Machines, albeit not conscious, learn in a similar way.

Machine Learning encompasses three main steps:

1. Input
2. Analysis & Computation
3. Output

1. Input

Machines need data in order to learn, and they need a lot of it! Depending on the goal, even millions
of data points may not be enough. All of this data must be fed into the system for analysis and
computation.

2. Analysis & Computation

A machine learning system first needs a goal. For example, suppose we want a machine to determine
whether a patient has a detached retina. The builder of the system inputs retinal scans that show a
detached retina. The machine learns how to spot the condition and “remembers” it. It is then given a
massive dataset of new scans, compares each one with what it has learned, and draws a conclusion
from those inputs.

3. Output

The results that the machine produces may or may not be correct. If a result is not correct, the machine
uses feedback to correct itself on the next run. If it is correct, the machine reinforces what it
learned for the next run.
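
To make the three steps concrete, here is a minimal sketch in Ruby, the language this series uses. It
assumes a perceptron, one of the simplest learning algorithms, and a toy dataset (the logical AND of
two flags). The class name Perceptron and its methods are illustrative and do not come from a
library. It is not a retina detector, only a small picture of the input, analysis, and feedback loop.

```ruby
# A minimal perceptron: a sketch of the input -> analysis -> output loop.
# Perceptron, predict, and train are illustrative names, not a library API.
class Perceptron
  def initialize(feature_count, learning_rate: 1.0)
    @weights = Array.new(feature_count, 0.0)
    @bias = 0.0
    @learning_rate = learning_rate # size of each correction
  end

  # Analysis & computation: weighted sum of the inputs, thresholded to 0 or 1.
  def predict(features)
    sum = @bias + features.zip(@weights).sum { |x, w| x * w }
    sum > 0 ? 1 : 0
  end

  # Output & feedback: compare each prediction with the known label and
  # adjust the weights only when the prediction was wrong.
  def train(samples, epochs: 20)
    epochs.times do
      samples.each do |features, label|
        error = label - predict(features)
        next if error.zero?

        @weights = @weights.zip(features).map { |w, x| w + @learning_rate * error * x }
        @bias += @learning_rate * error
      end
    end
    self
  end
end

# Input: toy labelled data (an assumption for illustration), the logical AND of two flags.
samples = [
  [[0, 0], 0],
  [[0, 1], 0],
  [[1, 0], 0],
  [[1, 1], 1]
]

model = Perceptron.new(2).train(samples)
samples.each do |features, label|
  puts "#{features.inspect} -> predicted #{model.predict(features)}, expected #{label}"
end
```

A perceptron can only learn patterns that a straight line can separate, so a real system would
likely need a richer model and far more data than four samples.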

Machine Learning is a powerful concept, but it needs a programming language to implement it.

What is Ruby?

Ruby is a programming language that is “dynamic [and] open source . . . with a focus on simplicity and
productivity [7].” It was developed by Yukihiro Matsumoto in the mid-1990s [8]. It has gained popularity
over the years and has become the 11th most popular programming language (January 2020) [9]. With
its focus on simplicity and productivity, Ruby has become a good choice for machine learning projects.

There are a vast number of resources available for using machine learning in Ruby [10]. During this
series, however, I will be focusing on the machine learning side of cybersecurity analytics.

Enhancing the Cybersecurity Industry

There is an increasing demand for cybersecurity professionals [11]. As in other industries [12], the
cybersecurity space is looking for ways to close this gap. Companies like Microsoft (Azure
machine-learning) [13] and Palo Alto Networks (Next-Generation Firewall) [14] are already using
machine learning to improve their detection of phishing emails and malware, respectively. How else can
we use machine learning in this sphere?

In this series, I propose that machine learning can be used to sift through security and audit logs to
identify suspicious user activity. This activity can then be passed to a security professional for
further analysis. In this way, security professionals do not waste their time on irrelevant log checks.
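
As a rough illustration of the idea, the Ruby sketch below learns a per-user baseline of failed logins
from past logs and flags users whose activity on a given day is far above that baseline. Everything
specific here is an assumption made for the example: the log line format, the method names, the sample
users, and the threshold. It is a simple statistical baseline rather than a full machine learning
model, and it is not necessarily the approach later parts of this series will take.

```ruby
# Hypothetical log line format (an assumption, not a real product's format):
#   "2020-01-25 alice login failure"
LogEntry = Struct.new(:date, :user, :action, :status)

def parse_log(lines)
  lines.map { |line| LogEntry.new(*line.split) }
end

# Failed logins per user per day, e.g. { "alice" => { "2020-01-20" => 1 } }.
# Days without any failure are not recorded, which is a simplification.
def failed_logins_by_day(entries)
  counts = Hash.new { |hash, user| hash[user] = Hash.new(0) }
  entries.each do |entry|
    next unless entry.action == "login" && entry.status == "failure"

    counts[entry.user][entry.date] += 1
  end
  counts
end

# "Learning" step: summarise each user's history as a mean and standard deviation.
def learn_baseline(history_counts)
  history_counts.transform_values do |by_day|
    values = by_day.values
    mean = values.sum.to_f / values.size
    variance = values.sum { |v| (v - mean)**2 } / values.size
    { mean: mean, stddev: Math.sqrt(variance) }
  end
end

# Flag users whose count is more than `threshold` spreads above their baseline.
# Users with no history are compared against a baseline of zero failures.
def suspicious_users(baseline, todays_counts, threshold: 3.0, min_stddev: 1.0)
  flagged = todays_counts.select do |user, count|
    stats = baseline.fetch(user, { mean: 0.0, stddev: 0.0 })
    spread = [stats[:stddev], min_stddev].max
    (count - stats[:mean]) / spread > threshold
  end
  flagged.keys
end

# Made-up sample data for illustration only.
history = [
  "2020-01-20 alice login failure",
  "2020-01-21 alice login failure",
  "2020-01-21 alice login success",
  "2020-01-20 bob login failure",
  "2020-01-20 bob login failure",
  "2020-01-21 bob login failure"
]
today = [
  "2020-01-25 alice login failure",
  "2020-01-25 alice login success"
] + ["2020-01-25 bob login failure"] * 12

baseline = learn_baseline(failed_logins_by_day(parse_log(history)))
todays_counts = failed_logins_by_day(parse_log(today))
                .transform_values { |by_day| by_day.fetch("2020-01-25", 0) }

flagged = suspicious_users(baseline, todays_counts)
p flagged
```

Only the flagged users would then go to a security professional for review. The min_stddev floor
keeps a user with a very steady history from being flagged over a single extra failure.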

References

[1] “2019 MidYear QuickView Data Breach Report”. [Online]. Available:
https://pages.riskbasedsecurity.com/2019-midyear-data-breach-quickview-report.
[Accessed: 25-Jan.-2020].

[2] “Gartner Forecasts Worldwide Information Security Spending to …”. [Online]. Available:
https://www.gartner.com/en/newsroom/press-releases/2018-08-15-gartner-forecasts-worldwide-information-security-spending-to-exceed-124-billion-in-2019.
[Accessed: 25-Jan.-2020].

[3] “110 Must-Know Cybersecurity Statistics for 2020 | Varonis”. [Online]. Available:
https://www.varonis.com/blog/cybersecurity-statistics/. [Accessed: 25-Jan.-2020].

[4] “Moore’s law – Wikipedia”. [Online]. Available: https://en.wikipedia.org/wiki/Moore%27s_law.
[Accessed: 25-Jan.-2020].

[5] “Technology exploration – Clever Franke”. [Online]. Available:
https://www.cleverfranke.com/technology-exploration. [Accessed: 25-Jan.-2020].

[6] “What is Machine Learning? A definition – Expert System”. [Online]. Available:
https://expertsystem.com/machine-learning-definition/. [Accessed: 25-Jan.-2020].

[7] “Ruby Programming Language”. [Online]. Available: https://www.ruby-lang.org/en/.
[Accessed: 25-Jan.-2020].

[8] “An introduction to Ruby Programming: the history of Ruby.”. [Online]. Available:
https://launchschool.com/books/ruby/read/introduction. [Accessed: 25-Jan.-2020].

[9] “index | TIOBE – The Software Quality Company”. [Online]. Available:
https://www.tiobe.com/tiobe-index/. [Accessed: 25-Jan.-2020].

[10] “Resources for Machine Learning in Ruby – gists · GitHub”. [Online]. Available:
https://gist.github.com/gbuesing/865b814d312f46775cda. [Accessed: 25-Jan.-2020].

[11] “Top 8 in-demand cybersecurity jobs in 2020 | EC-Council Official Blog”. [Online]. Available:
https://blog.eccouncil.org/top-8-in-demand-cybersecurity-jobs-in-2020/. [Accessed: 25-Jan.-2020].

[12] “Will a robot take my job? | The Age of A.I. – YouTube”. [Online]. Available:
https://www.youtube.com/watch?v=f2aocKWrPG8. [Accessed: 25-Jan.-2020].

[13] “Microsoft Azure: Cloud Computing Services”. [Online]. Available:
https://azure.microsoft.com/en-us/. [Accessed: 25-Jan.-2020].

[14] “PA-220 – Next-Gen Firewall – Palo Alto Networks”. [Online]. Available:
https://www.paloaltonetworks.com/network-security/next-generation-firewall/pa-220.
[Accessed: 25-Jan.-2020].