FAQs - LightScope

General Questions

LightScope is a cybersecurity research project that examines unwanted traffic on the internet. It is based upon work supported by the U.S. National Science Foundation under Grant No. 2313998 and the University of Southern California Information Sciences Institute.

Specifically, LightScope is software that turns closed ports on your live production machine into honeypots or network telescopes. It then examines the traffic that scanners and attackers have sent to learn about how they interact with you.

Research topics we examine include trends in AI/ML-based unwanted traffic, identifying and attributing large-scale distributed campaigns, and how scanners and attackers interact with live/production machines as opposed to honeypots or network telescopes.

To give back to those who deploy the LightScope software on their systems, we provide free general threat intelligence to the public, and free customized threat intelligence to those who deploy LightScope.

We are interested in the unwanted traffic that scanners and attackers send to your closed ports. We are not interested in legitimate traffic between you and your users. As a result, we have specifically designed LightScope not to collect any "wanted" traffic, or any information that may identify you or your users.

For example, if you are running a web server on port 80, LightScope will see that port 80 is open and will not collect any data going to or from that port. Our collection and storage methods were reviewed by the University of Southern California Institutional Review Board and certified exempt as study UP-25-00124, "LightScope - Survey of unwanted traffic to large user populations".
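The port 80 example can be sketched as a simple filter. This is a minimal illustration under assumptions, not LightScope's actual code: the function name `should_record` and the `open_ports` set are hypothetical.

```python
def should_record(dst_port: int, open_ports: set) -> bool:
    """Return True only for traffic aimed at a closed port.

    Hypothetical helper: traffic to an open port counts as "wanted"
    traffic and is never recorded.
    """
    return dst_port not in open_ports


# Assumed example: a web server listening on ports 80 and 443.
open_ports = {80, 443}
print(should_record(80, open_ports))    # False: port 80 is open, so it is skipped
print(should_record(2323, open_ports))  # True: closed port, so the traffic is unwanted
```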

Our code is open source and written in Python specifically to provide transparency. The University of Southern California IRB reviewed our study and found that it does not generate identifiable private information.

Specific Data Collected

We collect the following packet fields without modification. None of them can be used to identify participants in our study:

Time

Timestamp information for temporal analysis.

IPv4 Fields

These fields will not allow us to identify participants, and include:

Version, IHL (Internet Header Length), TOS (Type of Service)
Len (Length), ID (Identification), Flags, Frag (Fragment)
TTL (Time to Live), Proto (Protocol), Chksum (Checksum)
Src (Source), Options (including padding)
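
To make the field list concrete, here is a minimal sketch that decodes an IPv4 header with Python's standard `struct` module. It is an illustration only, not LightScope's own parser: the function name `parse_ipv4_header` and the sample packet are assumptions. The destination address is deliberately left out of the result, since the list above does not include it.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode an IPv4 header, omitting the destination address."""
    (ver_ihl, tos, length, ident, flags_frag,
     ttl, proto, chksum, src, _dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl = ver_ihl & 0x0F  # header length in 32-bit words
    return {
        "version": ver_ihl >> 4,
        "ihl": ihl,
        "tos": tos,
        "len": length,
        "id": ident,
        "flags": flags_frag >> 13,     # top 3 bits
        "frag": flags_frag & 0x1FFF,   # low 13 bits (fragment offset)
        "ttl": ttl,
        "proto": proto,
        "chksum": chksum,
        "src": socket.inet_ntoa(src),
        "options": raw[20:ihl * 4],    # empty when IHL is 5
    }


# Assumed sample header: a TCP packet (proto 6) with the Don't Fragment bit set.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
    socket.inet_aton("198.51.100.7"), socket.inet_aton("192.0.2.1"),
)
fields = parse_ipv4_header(sample)
print(fields["version"], fields["ihl"], fields["ttl"], fields["src"])
```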

TCP Fields

These fields will not allow us to identify participants, and include:

Sport (Source Port), Dport (Destination Port)
Seq (Sequence Number), Ack (Acknowledgment Number)
Dataofs (Data Offset), Reserved, Flags, Window
Chksum (Checksum), Urgptr (Urgent Pointer), Options

IPv6 Fields

Required to analyze IPv6 traffic. These fields will not allow us to identify participants, and include:

Version, Traffic class, Flow label
Payload length, Next header, Hop Limit, Source address

UDP Fields

Required to analyze UDP traffic. These fields will not allow us to identify participants, and include:

Source Port, Destination Port, Length, Checksum

ICMP Messages

Internet Control Message Protocol messages, which can indicate which UDP ports are closed, and can be another type of unwanted traffic in themselves. These messages will not allow us to identify participants.
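
As an illustration of how an ICMP message can reveal a closed UDP port: a host typically answers a UDP packet sent to a closed port with an ICMP Destination Unreachable message (type 3) carrying the Port Unreachable code (code 3). The sketch below checks for that combination. The function name `is_port_unreachable` is hypothetical and this is not LightScope's own logic.

```python
import struct

ICMP_DEST_UNREACHABLE = 3  # ICMP type, per RFC 792
ICMP_PORT_UNREACHABLE = 3  # code within that type


def is_port_unreachable(icmp: bytes) -> bool:
    """True if the ICMP message reports an unreachable (closed) port."""
    if len(icmp) < 2:
        return False
    icmp_type, code = struct.unpack("!BB", icmp[:2])
    return icmp_type == ICMP_DEST_UNREACHABLE and code == ICMP_PORT_UNREACHABLE


# Type and code, followed by checksum and unused bytes (zeroed in this sample).
print(is_port_unreachable(bytes([3, 3, 0, 0, 0, 0, 0, 0])))  # True
print(is_port_unreachable(bytes([8, 0, 0, 0, 0, 0, 0, 0])))  # False: echo request
```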

ARP Messages

Address Resolution Protocol messages, which will allow us to infer which machines are on the local network, and which machines are remote. These messages will not allow us to identify participants.

Anonymization and Privacy

IP Destination Field Anonymization

We also collect the IP destination field from the packets, but only in anonymized form, because the unmodified value could be used to identify participants in our study. We anonymize by randomizing the IPs consistently: each IP address maps one-to-one to an anonymized value, so the same address always yields the same value, but we cannot reverse the mapping.
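
One common way to get a consistent mapping that cannot be reversed without a secret is a keyed hash. The sketch below uses HMAC-SHA256 purely as an assumed technique for illustration; the source does not say which method LightScope uses, and the names `anonymize_ip` and `key` are hypothetical.

```python
import hashlib
import hmac
import ipaddress
import secrets


def anonymize_ip(ip: str, key: bytes) -> str:
    """Map an IP address to a stable pseudonym using a keyed hash.

    The same (ip, key) pair always yields the same value. Without the
    key, the pseudonym cannot be recomputed by trying candidate addresses.
    """
    packed = ipaddress.ip_address(ip).packed
    return hmac.new(key, packed, hashlib.sha256).hexdigest()


# Assumed setup: a random secret key that stays private.
key = secrets.token_bytes(32)
first = anonymize_ip("192.0.2.10", key)
again = anonymize_ip("192.0.2.10", key)
other = anonymize_ip("192.0.2.11", key)
print(first == again)  # True: the mapping is consistent
print(first != other)  # True: different addresses get different pseudonyms
```

Keeping the full 256-bit digest makes collisions between two addresses practically impossible, which is what keeps the mapping effectively one-to-one in this sketch.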

Network Type and Country Inference

We infer the Network Type and Country of the participants from their packets. This information cannot be used to identify participants in our study.

System Information

Machine Information

Information about the user's machine, which will help us determine the most likely use of the machines enrolled in our study. An example would be a web server vs a laptop, which we expect to have significantly different profiles. These fields will not allow us to identify participants, and include:

System Info (Operating system), Release Info, Version Info
Machine Info, Total memory, Processor, Architecture
Ports open, Network interfaces used
Whether the internal IP is private, whether the external IP is private
Time, Number of TCP packets LightScope inspected
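
Several of these fields can be gathered with Python's standard library alone. The sketch below is an assumption about how such a profile could be built, not LightScope's actual collection code; the function name `machine_profile` and the dictionary keys are hypothetical. It records only whether each address is private, never the address itself. Total memory is left out because reading it portably needs a third-party package or OS-specific calls.

```python
import ipaddress
import platform


def machine_profile(internal_ip: str, external_ip: str) -> dict:
    """Collect coarse, non-identifying facts about the host."""
    return {
        "system": platform.system(),
        "release": platform.release(),
        "version": platform.version(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "architecture": platform.architecture()[0],
        # Store booleans only; the addresses themselves are discarded.
        "internal_ip_is_private": ipaddress.ip_address(internal_ip).is_private,
        "external_ip_is_private": ipaddress.ip_address(external_ip).is_private,
    }


# Assumed sample addresses: one RFC 1918 address and one public address.
profile = machine_profile("192.168.1.20", "8.8.8.8")
print(profile["internal_ip_is_private"], profile["external_ip_is_private"])  # True False
```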

Network Classification

Lastly, we infer ASN, ASN type, and city of the participants from their network traffic. This information helps us categorize the type of network we are monitoring, which we believe should have an impact on the amount and type of traffic we observe. This information cannot be used to identify participants in our study.

Yes, LightScope is an open-source project developed under NSF grant #2313998. The source code is available on GitHub under the MIT license, allowing for both academic and commercial use with proper attribution.

We encourage contributions from the community to help improve and extend the capabilities of LightScope.

LightScope observes traffic sent to closed ports on production machines, while honeypots and network telescopes operate quite differently:

Honeypots vs. LightScope:

Honeypots are dedicated machines that usually don't run production services. They are designed to attract and trap attackers by simulating vulnerable systems.
Attackers actively attempt to avoid honeypots, which can significantly impact what we can learn from them. Modern attackers use various techniques to identify and bypass honeypots.
LightScope monitors real production systems, capturing authentic attacker behavior as they interact with legitimate infrastructure.

Network Telescopes vs. LightScope:

Network telescopes typically monitor unused IP address space, capturing traffic sent to addresses that shouldn't receive any legitimate traffic.
LightScope focuses on live, production endpoints that are actively used for legitimate purposes, providing insights into how attackers interact with real systems.

Key Advantages of LightScope:

Authentic Attack Patterns: By monitoring production systems, LightScope captures genuine attacker behavior without the artificial environment that honeypots create.
Real-world Context: Understanding how attackers target and interact with actual production infrastructure provides more actionable threat intelligence.
Evasion-resistant: Since LightScope monitors legitimate systems, attackers cannot easily identify and avoid them like they can with honeypots.
Comprehensive Coverage: Captures the full spectrum of scanning and attack activity against real infrastructure, not just traffic to unused space.
Operational Relevance: The data collected directly reflects threats to actual production environments, making it more relevant for defensive planning.

Why This Matters: Getting attacker and scanner data from live production endpoints is crucial because it provides the most accurate picture of actual threats. This real-world data helps organizations better understand their threat landscape and develop more effective security strategies based on how attackers actually behave when targeting legitimate infrastructure.

Your personalized LightScope dashboard URL is displayed during installation and is always available through systemctl status.

Easiest Method - systemctl status:

Run this command on your LightScope server to see your dashboard URL:


$ sudo systemctl status lightscope

Look for your dashboard URL in the "Docs:" section:

● lightscope.service - LightScope Network Security Monitor
     Loaded: loaded (/usr/lib/systemd/system/lightscope.service; enabled; preset: enabled)
     Drop-In: /etc/systemd/system/lightscope.service.d
              └─database-name.conf
     Active: active (running) since Sun 2025-06-29 05:17:31 UTC; 50s ago
       Docs: https://thelightscope.com
              https://thelightscope.com/tables/20250629_gvzdkbinpryhdrszsdzufpoeejxmoyhngrrjrjrxodfsuwf

🎯 Simply copy the second Docs URL (the one containing /tables/) and paste it into your browser!
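
If you prefer to extract the URL programmatically, a short script can pick it out of the status text. This is a minimal sketch that assumes the output looks like the sample above; the function name `find_dashboard_url` is hypothetical and not part of LightScope.

```python
import re
from typing import Optional


def find_dashboard_url(status_text: str) -> Optional[str]:
    """Return the first URL containing /tables/, or None if there is none."""
    match = re.search(r"https?://\S+/tables/\S+", status_text)
    return match.group(0) if match else None


# Sample text copied from the status output shown above.
sample_status = """
       Docs: https://thelightscope.com
              https://thelightscope.com/tables/20250629_gvzdkbinpryhdrszsdzufpoeejxmoyhngrrjrjrxodfsuwf
"""
print(find_dashboard_url(sample_status))
# https://thelightscope.com/tables/20250629_gvzdkbinpryhdrszsdzufpoeejxmoyhngrrjrjrxodfsuwf
```

In practice the text would come from the output of the systemctl command above, for example by saving it to a file first.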

Alternative Methods:

Web Interface

Visit the dashboard landing page and enter your database name:

https://lightscope.isi.edu/tables

Direct Config Check

Read the database name from the config file:

sudo cat /opt/lightscope/config/config.ini

Pro Tip:

Your database name follows the format YYYYMMDD_[random_letters] and is unique to your LightScope installation. The dashboard provides personalized threat intelligence based on the traffic observed by your specific endpoint.
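
The format can be checked with a short script, which is handy if you copy the name by hand. This sketch assumes the random part consists only of ASCII letters, as in the example URL on this page; the function name `parse_database_name` is hypothetical.

```python
import re
from datetime import date, datetime
from typing import Optional

# Eight digits, an underscore, then one or more letters (assumed ASCII).
_DB_NAME = re.compile(r"(\d{8})_([A-Za-z]+)")


def parse_database_name(name: str) -> Optional[date]:
    """Return the date encoded in a database name, or None if it is malformed."""
    match = _DB_NAME.fullmatch(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that do not form a real calendar date


print(parse_database_name("20250629_gvzdkbinpryhdrszsdzufpoeejxmoyhngrrjrjrxodfsuwf"))  # 2025-06-29
print(parse_database_name("not-a-database-name"))  # None
```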

Security & Safety

No, LightScope will NOT make your system more vulnerable to attacks.

The honeypot runs on remote servers managed by USC, not on your local machine. Your LightScope client only forwards traffic data - it doesn't run vulnerable services locally.

LightScope works with your existing firewall and network setup. No special configuration is required.

LightScope automatically configures itself based on your current system.
