A casual look at CSE-CIC-IDS2018 on AWS

Before a network intrusion detection system (NIDS) is formally deployed on a network, it needs a lot of testing, evaluation and tuning, and that requires a suitable data set. However, data acquisition currently has two main problems: (1) many data sets are internal and cannot be shared; (2) the data sets that are shared are so heavily anonymized that they no longer reflect current conditions and trends, or they lack certain statistical characteristics.

 

 

As a result, researchers often have to settle for suboptimal data sets. As network behavior and patterns change and intrusions evolve, research needs are gradually shifting from static, one-time data sets to dynamically generated ones. These not only reflect the traffic composition and intrusions of their time, but are also modifiable, extensible and reproducible.

 

 


The CSE-CIC-IDS2017/2018 on AWS data set is a collaborative project between the Communications Security Establishment (CSE) and the Canadian Institute for Cybersecurity (CIC). It is based on creating user profiles to generate diverse and comprehensive benchmark data sets for intrusion detection. The profiles contain abstract representations of the events and behaviors seen on the network, and they are combined to generate a set of different data sets, each with a unique set of features that covers part of the evaluation domain.

 

 

Infrastructure: the attacking side has 50 machines; the victim organization has five departments with 420 machines and 30 servers.

 

 

Data set: the captured network traffic and system logs of each machine, plus 80 features extracted from the captured traffic with CICFlowMeter-V3.

 

 

 

 

 

First, profiles

 

 

A profile contains a detailed description of intrusions and abstract distribution models for applications, protocols or lower-level network entities, and can be applied to a variety of network protocols with different topologies. Profiles can be used together to generate data sets for specific requirements.

 

 

Two categories:

 

 

1. B-profiles: capture the abstract behavior of users. Various machine learning and statistical analysis techniques (such as K-Means, Random Forest, SVM and J48) are used to encapsulate the users' entity behavior. The encapsulated features are the distribution of packet sizes for a protocol, the number of packets per flow, certain patterns in the payload, the payload size, and the distribution of request times for a protocol. In the testbed environment, the simulated protocols are HTTPS, HTTP, SMTP, POP3, IMAP, SSH and FTP. During the tests, most of the traffic was HTTP and HTTPS.

 

 


2. M-profiles: describe an attack scenario unambiguously. Once a person understands the attack, they can interpret the profile and carry it out. The details are shown in Table 1.
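To make the idea of a B-profile as "a distribution per protocol" a little more concrete, here is a minimal sketch. It is not how the data set's authors built their profiles (they mention K-Means, Random Forest, SVM and J48); it simply keeps the observed packet sizes per protocol as an empirical distribution and draws new values from it. The function names and the sample values are made up for illustration.

```python
import random
from collections import defaultdict


def build_profile(observations):
    """Group observed (protocol, packet_size) pairs by protocol.

    The list kept for each protocol acts as an empirical distribution.
    """
    profile = defaultdict(list)
    for protocol, packet_size in observations:
        profile[protocol].append(packet_size)
    return dict(profile)


def sample_packet_sizes(profile, protocol, n, rng):
    """Draw n packet sizes for a protocol from its empirical distribution."""
    return [rng.choice(profile[protocol]) for _ in range(n)]


# Hypothetical observations, not taken from the data set.
observed = [("HTTP", 512), ("HTTP", 1460), ("SSH", 96)]
profile = build_profile(observed)
print(sample_packet_sizes(profile, "HTTP", 5, random.Random(0)))
```

A real profile would also cover the other characteristics listed above (packets per flow, payload patterns, request timing), each with its own distribution.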

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Second, attack scenarios

 

 

 

This data set contains seven different attack scenarios:

 

 

1. Brute-force attack

 

 

A brute-force attack tries weak user name and password combinations to break into an account. The final scenario was designed to obtain SSH and MySQL accounts by running a dictionary brute-force attack against the main server.

 

 

In this data set, the FTP and SSH modules were run on a Kali Linux machine as the attacker, with an Ubuntu 14.0 machine as the victim. The password list was a large dictionary containing 90 million words. The recommended cracking tool is Patator, which is fully multithreaded, written in Python, and described as more reliable and flexible. It can save each response to a separate log file for later review.

 

 

 

 

 

2. Heartbleed

Heartbleed is one of the well-known "last updated attacks": attacks based on a particular vulnerability that can be carried out within a specific window of time. They sometimes affect millions of servers or victims, and it usually takes months to patch every vulnerable system. Heartleech is one of the best-known tools for exploiting Heartbleed. It can scan for systems that are vulnerable to the bug and then exploit it to exfiltrate data. OpenSSL version 1.0.1f is used as the victim application.

 

 

Some features of Heartleech:

 

 

Conclusive / inconclusive verdicts on whether the target is vulnerable

Fast bulk download of Heartbleed data into a large file, for offline processing with many threads

Automatic retrieval of private keys with no additional steps

Some limited IDS evasion

STARTTLS support

IPv6 support

Tor / Socks5n proxy support

Extensive connection diagnostic information

 

 

 

 

 

3. Botnet

 

 

This scenario uses Zeus, a Trojan horse malware package that runs on versions of Microsoft Windows. It is usually used to steal banking information through keystroke logging and form grabbing in the browser, and can also be used to install the CryptoLocker ransomware. Zeus spreads mainly through drive-by downloads and phishing schemes. As a complement, the scenario also uses Ares, an open-source botnet with the following capabilities:

 


 

Remote cmd.exe shell

Persistence

File upload / download

Screenshots

Keylogging

 

 

This data set uses the two botnets above to infect machines, and requests a screenshot from the infected machines every 400 seconds.

 

 

 

 

 

4. DoS (denial of service) and 5. DDoS (distributed denial of service)

 

 

HTTP denial of service: Slowloris and LOIC are the main tools. Both have been shown to make a web server completely inaccessible from a single attacking machine. Slowloris lets one machine take down another machine's web server with minimal bandwidth and few side effects on unrelated services and ports. It first establishes a full TCP connection with the remote server. It then keeps the connection open by periodically sending valid but incomplete HTTP requests, which prevents the socket from closing. Since any web server can only serve a finite number of connections, it is only a matter of time before all sockets are used up and no other connection can be made.

 

 

HOIC is an open-source network stress testing and denial of service attack application written in BASIC. It can launch DoS attacks on websites and is designed to attack up to 256 URLs at the same time. This data set uses four machines to carry out the DDoS attack.

 

 

 

 

 

6. Web attacks

 

 

The victim web application is Damn Vulnerable Web App (DVWA). Its main goal is to help security professionals test their skills and tools in a legal environment, help web developers better understand the process of securing web applications, and help teachers and students teach and learn web application security in a classroom environment. It is deliberately vulnerable. The first step is to scan the website with a web application vulnerability scanner, and then to carry out different types of web attacks on the vulnerable site, including SQL injection, command injection and unrestricted file upload.

 

 

7. Infiltration of the network

 

 

A malicious file is sent to the victim by e-mail and exploits an application vulnerability. After successful exploitation, a backdoor is executed on the victim's machine, which is then used to scan the internal network for other vulnerable machines and to exploit them where possible. The attacks include IP sweeps, full port scans and service enumeration using Nmap.

 

 

 

 

 

 Third, feature extraction

 

 

The tool used here is CICFlowMeter, a network traffic flow generator written in Java. It offers more flexibility in choosing which features to calculate, adding new features, and controlling the flow timeout duration. It generates bidirectional flows (biflows), in which the first packet determines the forward (source to destination) and backward (destination to source) directions. It has 83 statistical features, such as duration, number of packets, number of bytes and packet length, which are calculated separately for the forward and backward directions.

 

 

The application's output is a CSV file. Each flow is labeled with six columns (FlowID, SourceIP, DestinationIP, SourcePort, DestinationPort and Protocol), followed by more than 80 network traffic features. Typically, a TCP flow terminates when the connection is torn down (by a FIN packet), and a UDP flow terminates on flow timeout. The flow timeout value can be set arbitrarily for each scenario, for example 600 seconds for both TCP and UDP.
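The flow rules described here (first packet sets the forward direction, TCP flows end on FIN, other flows end on timeout) can be sketched roughly as follows. This is not CICFlowMeter's code. The `Packet` and `Flow` classes and `assemble_flows` are hypothetical names, the timeout is measured from the first packet of the flow as an assumption, and closing a TCP flow on the first FIN is a simplification of real TCP teardown.

```python
from dataclasses import dataclass

FLOW_TIMEOUT = 600.0  # seconds; the example value quoted in the text


@dataclass
class Packet:
    ts: float            # capture time in seconds
    src: str
    sport: int
    dst: str
    dport: int
    proto: str           # "TCP" or "UDP"
    length: int
    fin: bool = False    # TCP FIN flag


@dataclass
class Flow:
    key: tuple           # (src, sport, dst, dport, proto) of the FIRST packet
    start: float
    last: float
    fwd_packets: int = 0
    bwd_packets: int = 0
    fwd_bytes: int = 0
    bwd_bytes: int = 0

    def add(self, pkt: Packet) -> None:
        # Forward means "same direction as the first packet of the flow".
        if (pkt.src, pkt.sport, pkt.dst, pkt.dport) == self.key[:4]:
            self.fwd_packets += 1
            self.fwd_bytes += pkt.length
        else:
            self.bwd_packets += 1
            self.bwd_bytes += pkt.length
        self.last = pkt.ts


def assemble_flows(packets, timeout=FLOW_TIMEOUT):
    active, finished = {}, []
    for pkt in sorted(packets, key=lambda p: p.ts):
        a, b = (pkt.src, pkt.sport), (pkt.dst, pkt.dport)
        canon = (min(a, b), max(a, b), pkt.proto)  # direction-independent key
        flow = active.get(canon)
        if flow is not None and pkt.ts - flow.start > timeout:
            finished.append(active.pop(canon))     # flow timed out
            flow = None
        if flow is None:
            flow = Flow((pkt.src, pkt.sport, pkt.dst, pkt.dport, pkt.proto),
                        start=pkt.ts, last=pkt.ts)
            active[canon] = flow
        flow.add(pkt)
        if pkt.proto == "TCP" and pkt.fin:
            finished.append(active.pop(canon))     # simplified FIN handling
    finished.extend(active.values())               # flush flows still open
    return finished
```

Statistical features such as duration (`flow.last - flow.start`) or mean packet length per direction would then be computed from each finished `Flow`.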

 

 

 

After the features are extracted and the CSV file is created, the data is labeled. Each flow is labeled using the attack schedule, the source and destination IPs and ports, and the protocol name.
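A minimal sketch of that labeling step, assuming the attack schedule is available as a list of time windows. Every value below (the addresses, the date, the label text, the field names) is made up for illustration; the real schedule is published on the data set page.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AttackWindow:
    label: str
    start: datetime
    end: datetime
    attacker_ip: str
    victim_ip: str
    victim_port: int
    protocol: str


# Hypothetical schedule entry, NOT taken from the real data set.
SCHEDULE = [
    AttackWindow("Brute force (SSH)",
                 datetime(2018, 1, 1, 10, 0), datetime(2018, 1, 1, 11, 0),
                 "192.0.2.10", "198.51.100.5", 22, "TCP"),
]


def label_flow(flow: dict, schedule: list) -> str:
    """Return the attack label whose window matches the flow, else "Benign"."""
    for window in schedule:
        endpoints_match = (
            flow["src_ip"] == window.attacker_ip
            and flow["dst_ip"] == window.victim_ip
            and flow["dst_port"] == window.victim_port
            and flow["protocol"] == window.protocol
        )
        if endpoints_match and window.start <= flow["timestamp"] <= window.end:
            return window.label
    return "Benign"
```

A flow only gets an attack label when both the time window and the endpoints match; everything else falls through to "Benign".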

 

 

How to use it?

 

 

The data is organized by day. For each day, the raw data was recorded, including each machine's network traffic (pcaps) and event logs (Windows and Ubuntu event logs). During feature extraction from the raw data, CICFlowMeter-V3 was used to extract more than 80 traffic features, which are saved as one CSV file per machine.

 

 

1. To analyze with AI techniques: download the generated data (CSV) files and analyze the network traffic.

 

 

2. To use a new feature extractor: take the original captured files (pcap and logs) and extract the features you need. Then use data mining techniques to analyze the generated data.
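For the first option, a reasonable first step is to check how many flows of each class a downloaded CSV contains. The sketch below uses only the standard library. The column name `Label` and the file name in the usage line are assumptions, so check the header of the file you actually downloaded.

```python
import csv
from collections import Counter


def count_labels(csv_path, label_column="Label"):
    """Count how many flows carry each label in one feature CSV.

    label_column is an assumption; adjust it to match the file's header.
    """
    counts = Counter()
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            counts[row[label_column].strip()] += 1
    return counts
```

Usage would look like `count_labels("one_day.csv").most_common()`, where `one_day.csv` stands for whichever daily file you downloaded. Reading row by row keeps memory use flat, which matters because the daily files can be large.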

 

 

 

 

 

Finally, the CSE-CIC-IDS2018 on AWS page is here: https://www.unb.ca/cic/datasets/ids-2018.html

 

 

 

 
