A casual look at CSE-CIC-IDS2018 on AWS
Before a network intrusion detection system (NIDS) is formally deployed, it needs extensive testing, evaluation and tuning, which in turn requires a suitable data set. Data acquisition currently faces two main problems: (1) many data sets are internal and cannot be shared; (2) the data that is shared is heavily anonymized, so it no longer reflects current trends or certain statistical characteristics.
So we have to settle for suboptimal data sets that come closest to our
requirements. Moreover, as network behavior and patterns change and intrusions
evolve, research needs are gradually shifting from static, one-time data sets
to dynamically generated ones. Such data sets not only reflect the traffic
composition and intrusions of their time, but are also modifiable, extensible
and reproducible.
The CSE-CIC-IDS2018 on AWS data set is a collaborative project between the
Communications Security Establishment (CSE) and the Canadian Institute for
Cybersecurity (CIC). It is based on creating user profiles to generate diverse
and comprehensive benchmark data sets for intrusion detection. The profiles
contain abstract representations of the events and behaviors seen on the
network, and they can be combined to generate a set of different data sets,
each of which can be used for intrusion detection. Each data set has a unique
set of features that covers part of the evaluation domain.
Attack infrastructure: 50 computers. The victim organization has five
departments, with 420 computers and 30 servers.
Data set: the captured network traffic and system logs of each computer, plus
80 features extracted from the captured traffic using CICFlowMeter-V3.
1. Profiles
A profile contains a detailed description of intrusions and abstract
distribution models for applications, protocols or lower-level network
entities. Profiles can be applied to a wide range of network protocols with
different topologies, and they can be used together to generate data sets for
specific requirements.
There are two categories of profiles:
1. B-profiles: encapsulate the abstract behavior of users. Various machine
learning and statistical analysis techniques (such as k-means, random forest,
SVM and J48) are used to capture the behavior of user entities. The
encapsulated features are the distribution of packet sizes per protocol, the
number of packets per flow, certain patterns in the payload, the payload size
and the distribution of request times per protocol. In the testbed environment,
the simulated protocols are HTTPS, HTTP, SMTP, POP3, IMAP, SSH and FTP. During
testing, most of the traffic was HTTP and HTTPS.
2. M-profiles: describe an attack scenario unambiguously. Someone who
understands the attacks can interpret the profile and carry it out.
The details are shown in Table 1.
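To make the B-profile idea more concrete, here is a minimal sketch that treats a profile as an empirical distribution built from observed values and then samples synthetic values from it. This is only an illustration: the data set's authors used machine learning and statistical techniques, whereas `build_profile`, `sample_profile` and the packet sizes below are hypothetical names and values chosen for this example.

```python
import random
from collections import Counter

def build_profile(observations):
    """Turn observed values (e.g. packet sizes) into an empirical distribution."""
    counts = Counter(observations)
    total = sum(counts.values())
    values = sorted(counts)
    weights = [counts[v] / total for v in values]
    return values, weights

def sample_profile(profile, n, seed=0):
    """Draw n synthetic values that follow the profiled distribution."""
    values, weights = profile
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return rng.choices(values, weights=weights, k=n)

# Hypothetical packet sizes (bytes) observed for one protocol.
observed_sizes = [60, 60, 60, 1500, 1500, 576]
profile = build_profile(observed_sizes)
synthetic = sample_profile(profile, 1000)
```

The same pattern could be applied to the other profiled quantities, such as packets per flow or request times.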
2. Attack scenarios
This data set contains seven different attack scenarios:
1. Brute force
A brute-force attack tries weak user name and password combinations to break
into an account. The goal of this scenario is to obtain SSH and MySQL accounts
by running dictionary brute-force attacks against the main server.
In this data set, a Kali Linux machine acts as the attacker against the FTP and
SSH services, and an Ubuntu 14.0 system is the victim. The password list is a
large dictionary containing 90 million words. The recommended cracking tool is
Patator, a multithreaded tool written in Python that is comparatively reliable
and flexible. It can save each response in a separate log file for later
review.
2. Heartbleed
Heartbleed is one of the well-known attacks based on a recently disclosed
vulnerability. Such attacks can be carried out within a limited time window;
the vulnerabilities sometimes affect millions of servers or victims, and it
usually takes months to patch all of them. Heartleech is one of the best-known
tools for exploiting Heartbleed. It can scan for systems that are vulnerable to
the bug and then exploit them to steal data. OpenSSL version 1.0.1f is used as
the victim application.
Appendix: some features of Heartleech
conclusive / inconclusive verdicts on whether the target is vulnerable
fast bulk download of Heartbleed data into a large file, for offline processing using many threads
automatic retrieval of private keys with no additional steps
some limited IDS evasion
STARTTLS support
IPv6 support
Tor / Socks5n proxy support
extensive connection diagnostic information
3. Botnet
This scenario uses Zeus, a Trojan horse malware package that runs on versions
of Microsoft Windows. It is usually used to steal banking information through
keystroke logging and form grabbing in the browser, and it can also be used to
install the CryptoLocker ransomware. Zeus spreads mainly through drive-by
downloads and phishing schemes. As a complement, Ares, an open source botnet,
is also used. It has the following capabilities:
remote cmd.exe shell
persistence
file upload / download
screenshot
key logging
This data set uses the two botnets above to infect computers, and requests a
screenshot from the infected machines every 400 seconds.
4. DoS (denial of service) and 5. DDoS (distributed denial of service)
HTTP denial of service: Slowloris and LOIC are the main tools. They have been
shown to make a web server completely inaccessible from a single attacking
machine. Slowloris lets one computer take down another machine's web server
with minimal bandwidth and with little side effect on unrelated services and
ports. It first establishes a full TCP connection to the remote server. The
tool then keeps the connection open by periodically sending valid but
incomplete HTTP requests, which prevents the socket from closing. Since any web
server can serve only a limited number of connections, it is only a matter of
time before all sockets are used up and no other connection can be made.
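The socket-exhaustion argument can be illustrated with a toy model that involves no networking at all. It is a sketch under stated assumptions: the pool size, the rate of new connections and the function name `simulate_slow_connections` are hypothetical, and a real server also has timeouts and legitimate clients that this model ignores.

```python
def simulate_slow_connections(max_sockets, new_per_tick, ticks):
    """Count open sockets over time when connections are never released.

    Each tick the client opens `new_per_tick` connections and keeps all
    earlier ones alive, so the server accepts only until its pool is full.
    """
    open_sockets = 0
    history = []
    for _ in range(ticks):
        free = max_sockets - open_sockets
        open_sockets += min(new_per_tick, free)
        history.append(open_sockets)
    return history

# Hypothetical server with 256 sockets, 50 new held connections per tick.
history = simulate_slow_connections(max_sockets=256, new_per_tick=50, ticks=8)
# Index of the first tick at which no free socket is left.
exhausted_at = next(i for i, n in enumerate(history) if n == 256)
```

Because held connections are never released, the count can only grow until it reaches the pool size, which is why the attack needs so little bandwidth.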
HOIC is an open source network stress testing and denial-of-service attack
application written in BASIC. It can launch DoS attacks against websites and
can target up to 256 URLs at the same time. This data set uses four computers
to carry out the DDoS attack.
6. Web attacks
Damn Vulnerable Web App (DVWA) is used as the victim web application. The main
goals of DVWA are to help security professionals test their skills and tools in
a legal environment, help web developers better understand the process of
securing web applications, and help teachers and students teach and learn web
application security in a classroom environment. It is intentionally vulnerable
to attack. The first step is to scan the website with a web application
vulnerability scanner; different types of web attacks are then carried out
against the vulnerable site, including SQL injection, command injection and
unrestricted file upload.
7. Infiltration of the network
A malicious file is sent to the victim by e-mail and exploits an application
vulnerability. After successful exploitation, a backdoor is executed on the
victim's computer, which is then used to scan the internal network for other
vulnerable machines and to exploit them where possible. The attacks include IP
sweeps, full port scans and service enumeration using Nmap.
3. Feature extraction
The tool used here is CICFlowMeter, a network traffic flow generator written in
Java. It offers greater flexibility in choosing which features to calculate,
adding new features and controlling the flow timeout. It generates
bidirectional flows (biflows), where the first packet determines the forward
(source to destination) and backward (destination to source) directions. It
produces 83 statistical features, such as duration, number of packets, number
of bytes and packet length, which are calculated separately for the forward and
backward directions.
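The biflow idea can be sketched in a few lines of Python. This is not CICFlowMeter's code (which is written in Java); it is a simplified illustration in which the packet tuple layout, the function name `build_biflows` and the sample addresses are all assumptions made for the example.

```python
def build_biflows(packets):
    """Group packets into bidirectional flows.

    Each packet is (timestamp, src_ip, src_port, dst_ip, dst_port, proto, length).
    The first packet seen for a flow fixes its forward direction.
    """
    flows = {}
    for ts, src, sport, dst, dport, proto, length in packets:
        fwd_key = (src, sport, dst, dport, proto)
        bwd_key = (dst, dport, src, sport, proto)
        if fwd_key in flows:
            key, direction = fwd_key, "fwd"
        elif bwd_key in flows:
            key, direction = bwd_key, "bwd"
        else:
            # New flow: this packet defines source and destination.
            flows[fwd_key] = {"start": ts, "end": ts,
                              "fwd_pkts": 0, "bwd_pkts": 0,
                              "fwd_bytes": 0, "bwd_bytes": 0}
            key, direction = fwd_key, "fwd"
        flow = flows[key]
        flow["end"] = ts
        flow[direction + "_pkts"] += 1
        flow[direction + "_bytes"] += length
    for flow in flows.values():
        flow["duration"] = flow["end"] - flow["start"]
    return flows

# Hypothetical three-packet exchange between a client and a web server.
packets = [
    (0.0, "192.0.2.5", 40000, "192.0.2.9", 80, "TCP", 60),
    (0.1, "192.0.2.9", 80, "192.0.2.5", 40000, "TCP", 60),
    (0.3, "192.0.2.5", 40000, "192.0.2.9", 80, "TCP", 500),
]
flows = build_biflows(packets)
```

All three packets end up in a single flow keyed by the first packet's direction, with packet and byte counts kept separately for each direction.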
The application's output is a CSV file. Each flow has six identifying columns
(FlowID, SourceIP, DestinationIP, SourcePort, DestinationPort and Protocol),
followed by more than 80 network traffic features. Normally, a TCP flow
terminates when the connection is torn down (by a FIN packet), while a UDP flow
terminates on a flow timeout. The flow timeout value can be set arbitrarily for
each scheme, for example 600 seconds for both TCP and UDP.
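The termination rule can be sketched as follows. This is a deliberately simplified illustration, not CICFlowMeter's logic: it ends a TCP flow at the first FIN it sees (a real teardown involves FINs in both directions), and the packet layout and the name `split_flow` are assumptions.

```python
FLOW_TIMEOUT = 600.0  # seconds, the example value mentioned in the text

def split_flow(packets, timeout=FLOW_TIMEOUT):
    """Split the packets of one 5-tuple into separate flows.

    Each packet is (timestamp, proto, has_fin). A new flow starts when the
    gap since the previous packet exceeds `timeout`, or after a TCP packet
    carrying FIN has closed the previous flow.
    """
    flows, current = [], []
    for ts, proto, has_fin in sorted(packets):
        if current and ts - current[-1][0] > timeout:
            flows.append(current)  # idle for too long: close on timeout
            current = []
        current.append((ts, proto, has_fin))
        if proto == "TCP" and has_fin:
            flows.append(current)  # FIN seen: close the TCP flow
            current = []
    if current:
        flows.append(current)
    return flows

# Hypothetical packet sequences for one TCP and one UDP conversation.
tcp_packets = [(0.0, "TCP", False), (1.0, "TCP", True), (2.0, "TCP", False)]
udp_packets = [(0.0, "UDP", False), (10.0, "UDP", False), (700.0, "UDP", False)]
tcp_flows = split_flow(tcp_packets)
udp_flows = split_flow(udp_packets)
```

In the TCP example the FIN closes the first flow, so the later packet starts a second one; in the UDP example the 690-second gap exceeds the timeout and has the same effect.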
After the features are extracted and the CSV file is created, the data is
labeled. The attack schedule, the source and destination IPs and ports, and the
protocol name are used to label each flow.
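A labeling step of this kind might look like the sketch below. The schedule entry, the field names and the function `label_flow` are all hypothetical (the addresses come from a range reserved for documentation), and for brevity only the destination port is matched; the real attack schedule is published with the data set.

```python
from datetime import datetime

# Hypothetical schedule entry, for illustration only.
SCHEDULE = [
    {
        "label": "example-attack",
        "start": datetime(2000, 1, 1, 10, 0),
        "end": datetime(2000, 1, 1, 11, 0),
        "attacker_ip": "192.0.2.10",
        "victim_ip": "192.0.2.20",
        "dst_port": 22,
        "protocol": "TCP",
    },
]

def label_flow(flow, schedule=SCHEDULE, default="Benign"):
    """Return the label of the first schedule entry the flow matches."""
    for entry in schedule:
        if (entry["start"] <= flow["timestamp"] <= entry["end"]
                and flow["src_ip"] == entry["attacker_ip"]
                and flow["dst_ip"] == entry["victim_ip"]
                and flow["dst_port"] == entry["dst_port"]
                and flow["protocol"] == entry["protocol"]):
            return entry["label"]
    return default

attack_flow = {
    "timestamp": datetime(2000, 1, 1, 10, 30),
    "src_ip": "192.0.2.10",
    "dst_ip": "192.0.2.20",
    "dst_port": 22,
    "protocol": "TCP",
}
# Same flow, but from a machine that is not in the schedule.
normal_flow = dict(attack_flow, src_ip="192.0.2.99")
labels = [label_flow(attack_flow), label_flow(normal_flow)]
```

Any flow that matches no schedule entry falls through to the default label.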
How to use it?
The data set is organized by day. For each day, the raw data was recorded,
including the network traffic (PCAPs) and event logs (Windows and Ubuntu event
logs) of each machine. During feature extraction from the raw data,
CICFlowMeter-V3 was used to extract more than 80 traffic features and save them
as one CSV file per machine.
1. To analyze with AI techniques: download the generated data (CSV files) and
analyze the network traffic.
2. To use a new feature extractor: use the raw captured files (PCAPs and logs)
to extract the features you need, then analyze the generated data with data
mining techniques.
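As a first step for option 1, the label distribution of a downloaded CSV file can be inspected with the Python standard library alone. The sketch below runs on a tiny in-memory stand-in; the column names (including `Label`) and the rows are assumptions for illustration and should be checked against the header of the actual files.

```python
import csv
import io
from collections import Counter

def count_labels(csv_file, label_column="Label"):
    """Count how many flows carry each label in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)

# In-memory stand-in for a downloaded CSV; values are made up.
sample = io.StringIO(
    "Dst Port,Protocol,Flow Duration,Label\n"
    "443,6,1532,Benign\n"
    "22,6,87,example-attack\n"
    "80,6,20411,Benign\n"
)
counts = count_labels(sample)
```

With a real file, the same function can be given `open(path, newline="")` in place of the `io.StringIO` object.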
Finally, the link to CSE-CIC-IDS2018 on AWS:
https://www.unb.ca/cic/datasets/ids-2018.html