archive 2019

Explorations in Next Generation Internet
NGI Forward focuses on identifying and evaluating the key enabling technologies and topics that will underpin the Next Generation Internet.
NGI Forward's three key pillars:

Developing a cutting-edge data-driven methodology for identifying early signals of new trends & technologies.

Mapping the ecosystems & networks surrounding these key topics, evaluating their social, legal, technological, ethical & economic contexts.

Creating a value-driven vision for what the future internet could and should look like, involving a wide variety of voices across Europe.



Analysis period: three years and four months

Why ArXiv and SSRN?

  • Lengthy publication process in scientific journals
  • Broad coverage: SSRN's e-Library provides over 800,000 research papers from 400,000 researchers across 30 disciplines; ArXiv provides open access to almost 1,500,000 e-prints, mostly in STEM fields

Project Goal and General Idea

  • The major aim is to identify the key technologies that will shape the development of the Internet until 2025
  • Strong focus on the relationship between technological areas and social issues
  • Data-driven approach with heterogeneous sources of data

Trend analysis

  • Analysis based on the frequency of appearances for all unigrams and bigrams in the texts
  • Average monthly change in the analysed term's frequency is calculated by OLS regressions
  • The coefficient reveals the trending unigrams and bigrams
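A minimal sketch of the trend calculation described above, written with the standard library only. The function names, the per-token normalisation of frequencies and the toy corpus are assumptions made for illustration; the text does not state how frequencies were normalised or which regression library was used.

```python
def count_term(tokens, term_words):
    """Count occurrences of a unigram or bigram (given as a word list) in a token list."""
    n = len(term_words)
    return sum(tokens[i:i + n] == term_words for i in range(len(tokens) - n + 1))


def monthly_frequencies(docs_by_month, term):
    """Relative frequency of `term` per month (occurrences per token, an assumed normalisation)."""
    term_words = term.lower().split()
    freqs = []
    for docs in docs_by_month:
        hits = 0
        total = 0
        for doc in docs:
            tokens = doc.lower().split()
            hits += count_term(tokens, term_words)
            total += len(tokens)
        freqs.append(hits / total if total else 0.0)
    return freqs


def ols_slope(values):
    """Slope of a simple OLS regression of `values` on the month index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two months are needed to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


# Hypothetical toy corpus: one list of article texts per month.
docs_by_month = [
    ["quantum computing is hard", "the internet is big"],
    ["quantum computing grows", "quantum computing news today"],
    ["quantum computing meets quantum computing", "more quantum computing here"],
]
freqs = monthly_frequencies(docs_by_month, "quantum computing")
print(ols_slope(freqs))  # a positive slope marks a term whose frequency is rising
```

Ranking all unigrams and bigrams by this slope would give a list of trending terms in the sense used here.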

Co-occurrence analysis

  • Exploring the relationship between topics
  • Pairs of terms which are mentioned together in media articles
  • The number of articles containing both terms is divided by the number of articles including our previously identified keyword of interest for every media website
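The ratio described above can be sketched as follows. The data layout (pairs of website and article text), the function names and the whole-word matching rule are assumptions for illustration, not the project's actual pipeline.

```python
import re
from collections import defaultdict


def mentions(text, term):
    """True if `term` appears in `text` as a whole word or phrase (case-insensitive)."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, text.lower()) is not None


def cooccurrence_shares(articles, keyword, term):
    """For each website: articles containing both terms / articles containing the keyword."""
    with_keyword = defaultdict(int)
    with_both = defaultdict(int)
    for website, text in articles:
        if mentions(text, keyword):
            with_keyword[website] += 1
            if mentions(text, term):
                with_both[website] += 1
    # Websites with no article mentioning the keyword are left out (no denominator).
    return {site: with_both[site] / count for site, count in with_keyword.items()}


# Hypothetical toy data.
articles = [
    ("site_a", "AI and facial recognition"),
    ("site_a", "AI chips"),
    ("site_b", "facial recognition law"),
    ("site_b", "AI ethics and facial recognition"),
]
print(cooccurrence_shares(articles, "AI", "facial recognition"))
```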

Issue mapping

  • Articles are categorised across two dimensions: geography (EU vs US) and covered topic (social vs technological)
  • Words are ranked based on their frequency in articles classified as social and non-social (technological)


Topic identification

The most trending NGI-related keywords are identified

Grouped into wider areas
The size of the bubble is based on the regression coefficient
Bigger bubble: more robust trend


The goal is to dive deeper into emerging technologies
Relationship between social issues and technology

Umbrella topics

AI & Machine-learning

Artificial Intelligence and machine learning are among the most important computer science fields, with huge social implications. The top trending terms include specific algorithms (e.g. reinforcement learning), tools (e.g. PyTorch) and various controversial applications, such as deepfakes or Google’s Project Maven. Moreover, AI and ML may be crucial in solving many social challenges, as in the case of the content crisis on social media.

Internet of Things

Internet of Things, along with various related technologies (AR/VR), has large potential to transform consumer electronics and production systems as well (industrial IoT). On the other hand, IoT devices raise cybersecurity and privacy concerns (e.g. smart speakers).

Blockchain & cryptocurrencies

Blockchain has long been regarded as a transformative technology with large disruptive potential. Blockchain technologies may play a central role in the future of social media, financial services and other intermediation services. As of today, the most widespread implementation of blockchain is related to cryptocurrencies. As an emerging technology, blockchain raises pressing regulatory issues.

Quantum computing

Although there are promising developments, quantum computing is not likely to become a mature technology in the next few years. However, it provides an opportunity for Europe to regain its competitive edge in advanced technologies. Mapping quantum technology areas and developments therefore adds considerable value.

Ethical AI & ML

As discussions about the potential transformative impact of AI and machine learning have come to dominate public debate in recent years, so have concerns about the potential negative side effects of allowing these kinds of technologies to play an ever-larger role in decision-making and the governing of our societies. The development of ethical AI and ML tools involves not only the use of responsibly managed data (ensuring a representative sample, privacy and anonymity) and algorithms that do not reinforce existing societal biases (around gender and ethnicity, for example), but also the use of the tools themselves for purposes we consider ethically just. This means ensuring that solutions are fair and inclusive along the whole value chain, from data generation to the impact of the decisions being made or tasks replaced.

Internet regulation

Europe has been at the forefront of online regulation with GDPR, while the copyright directive (especially Articles 11 and 13) has been more polarising among stakeholders. In the US, recent discussion has focused on online content and Section 230 (platforms are not liable for user-generated content) and on the controversial repeal of net neutrality rules.

Social media & content crisis

The spread of fake news, misinformation and the decline of trust in reliable sources create a profound challenge for the functioning of democracies and societies. While regulating platforms or implementing advanced topic filtering algorithms are among possible solutions, bringing back trust to written words may be far more complicated.

Market competition

The giants of the digital economy (GAFA: Google, Amazon, Facebook and Apple) all function as platforms with incredible market power. While the US has been less active in regulating market competition, e.g. in the case of Facebook's acquisition of rivals Instagram and WhatsApp, the EU is leading the discussion on ensuring competition in the Digital Single Market.

Chinese tech sector

China has managed to build a vibrant ecosystem in key technologies such as AI and 5G. The growing strength of the Chinese tech sector poses a momentous challenge for both Europe and the US. China may be the forerunner in developing advanced AI systems and 5G networks, while advocating an approach to citizen rights and privacy that is in stark contrast to European values.


Issue mapping

Articles are classified in two dimensions: EU/US, social issue/technology

EU axis: articles from European sources or concerning Europe, residualized on the social issues axis
Social issues axis: articles containing a sufficient number of words weighted by their inverse frequency from a pre-defined list of social topics based on Latent Dirichlet Allocation
Mapping trending words with article type based on number of occurrences
Top right corner: EU articles on social issues
Bottom left corner: US articles on technology
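A rough sketch of how an article could be scored on the social issues axis. The text does not give the exact weighting formula or the threshold, so the logarithmic inverse document frequency, the threshold argument and all names below are assumptions. Building the word list with Latent Dirichlet Allocation and residualising the EU axis are not reproduced here.

```python
import math


def idf_weights(articles, social_words):
    """Inverse-frequency weight for each social word: log(N / number of articles containing it)."""
    n = len(articles)
    token_sets = [set(article.lower().split()) for article in articles]
    weights = {}
    for word in social_words:
        doc_freq = sum(word in tokens for tokens in token_sets)
        if doc_freq:  # words that never occur in the corpus get no weight
            weights[word] = math.log(n / doc_freq)
    return weights


def social_score(article, weights):
    """Sum of the weights of all social-list words occurring in the article."""
    return sum(weights.get(token, 0.0) for token in article.lower().split())


def is_social(article, weights, threshold):
    """Classify an article as social if its score reaches the (assumed) threshold."""
    return social_score(article, weights) >= threshold


# Hypothetical toy corpus and word list.
corpus = ["privacy law debate", "new chip released", "chip privacy"]
weights = idf_weights(corpus, {"privacy", "law"})
print([is_social(article, weights, threshold=1.0) for article in corpus])
```

Rare social words contribute more to the score than common ones, which is the effect of weighting by inverse frequency.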


The sentiment analysis produces a compound score for every paragraph containing a given phrase. The score is the mean of the valence scores of the words in the paragraph, excluding the analysed words themselves, which are removed from the paragraph's text before scoring.
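The scoring step can be illustrated with a small sketch. The lexicon below is a hypothetical toy dictionary with made-up valence values, and treating words missing from the lexicon as neutral (0.0) is an assumption; the text names neither the lexicon nor the normalisation used to obtain the compound score.

```python
import re

# Hypothetical valence lexicon; a real analysis would use a full sentiment dictionary.
TOY_LEXICON = {"good": 1.9, "great": 3.1, "bad": -2.5, "threat": -2.4}


def paragraph_score(paragraph, phrase, lexicon):
    """Mean valence of the words in `paragraph` after removing the analysed `phrase`."""
    # Remove the analysed phrase so that its own words do not influence the score.
    text = re.sub(re.escape(phrase.lower()), " ", paragraph.lower())
    words = re.findall(r"[a-z0-9'-]+", text)
    if not words:
        return 0.0
    # Words missing from the lexicon count as neutral (assumption).
    return sum(lexicon.get(word, 0.0) for word in words) / len(words)


print(paragraph_score("Killer robots are a great threat", "killer robots", TOY_LEXICON))
```

Averaging these paragraph scores for each co-occurring term would then allow terms to be ranked from most positive to most negative, as in the tables below.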
project maven

Most positive | Most negative
ethical principles | military drones
neural networks | killer robots
recognition software | sexual harassment
government contracts | autonomous weapons
defense innovation | lethal autonomous

facial recognition

Most positive | Most negative
voice assistant | border guards
ai technology | autonomous weapons
ai research | project maven
edge computing | big brother
ai startup | racial bias

iot tech

Most positive | Most negative
industrial iot | security concerns
emerging technologies | security threats
edge computing | cyber security
cloud vendors | 5g networks
enterprise technology | network equipment


Most positive | Most negative
digital twin | cyber security
machine-learning model | security risks
edge devices | subscription price
industrial automation | random access
smart speaker | reduce latency


Most positive | Most negative
renewable energy | child abuse
enterprise technology | alex jones
edge computing | cryptocurrency wallet
startup battlefield | conspiracy theories
decentralized apps | ponzi scheme

quantum computing

Most positive | Most negative
reinforcement learning | logic qubit
chinese researchers | five eyes
first demonstration | global trade
information sciences | trump administration
nobel prize | quantum supremacy

quantum technology

Most positive | Most negative
50 qubits | facebook users
quantum network | cambridge analytica
european ai | le maire
ai research | data breach
information sciences | sk telecom

chinese tech

Most positive | Most negative
ai platform | uncle sam
ai technology | meng wanzhou
ai applications | founder ren
ai algorithms | huawei employees
neural networks | huawei products

chinese telecom

Most positive | Most negative
oversight board | huawei cfo
user data | sanction law
huawei cyber | us sanctions
security evaluation | export ban
5g security | trade talks

chinese government

Most positive | Most negative
former intelligence | us sanctions
quantum computing | against iran
neural networks | sanctions against
facial recognition | huawei employees
china mobile | stealing trade

huawei equipment

Most positive | Most negative
deploy 5g | stealing trade
security council | criminal charges
oversight board | guo ping
chinese equipment | meng wanzhou
five eyes | against huawei

conspiraci theory

Most positive | Most negative
cyber security | mass shooting
battery replacement | white supremacist
recommendation algorithms | sandy hook
tech platforms | alex jones
brexit referendum | youtube kids

russian trolls

Most positive | Most negative
ads purchased | federal news
facebook revealed | sow discord
collusion between | troll factory
facebook spokesperson | russian meddling
project lakhta | information warfare


Most positive | Most negative
jonathan zittrain | upcoming election
neural networks | watchdog organ
election interference | brazilian election
facebook ceo | fact-checking operation
political debate | whatsapp groups


Most positive | Most negative
coin offerings | remove illegal
privacy regulators | child sexual
civil rights | consent decree
federal privacy | false information
data portability | surveillance capitalism

tech policy

Most positive | Most negative
richard blumenthal | sheryl sandberg
ai technology | google employees
protection regulation | neutrality bill
chinese government | user privacy
general data | civil rights

competition policy

Most positive | Most negative
french president | antitrust laws
big tech | market power
fake news | media platforms
digital marketing | commission found
elizabeth warren | targeted ads

fined google

Most positive | Most negative
antitrust decisions | commission slapped
privacy regulators | surveillance capitalism
search app | roger mcnamee
contractual restrictions | shoshana zuboff
big brother | record fine

tech giants

Most positive | Most negative
renewable energy | child sexual
neural networks | remove illegal
ai systems | alex jones
voice assistant | terrorist content
quantum computing | conspiracy theories

digital taxation

Most positive | Most negative
global player | european council
eu leaders | global solutions
emmanuel macron | services tax
digital revenues | global revenue
international solution | economic co-operation

section 230

Most positive | Most negative
political ads | child exploitation
cambridge analytica | users upload
google news | remove content
user data | congressional hearings
fake news | sex trafficking

neutrality laws

Most positive | Most negative
5g networks | preempt state
jeff sessions | state attorneys
repeal net | internet regulation
xavier becerra | california law
california attorney | information service

copyright directive

Most positive | Most negative
european creator | automated surveillance
fair remuner | committee voted
creative content | recognition technology
# saveyourinternet | susan wojcicki
european publishers | key legisl


Most positive | Most negative
voice assistant | british airways
ai research | article 11
digital marketing | electronic health
third-party data | facebook data
process data | plain text


Topic modelling

Click to see topic modelling deliverable

This study presents an innovative methodology for analysing technology news using various text mining methods. News articles provide a rich source of information to track promising emerging technologies, relevant social challenges or policy issues. Our goal is to support the Next Generation Internet initiative by providing data science tools to map and analyse the developments of the tech world. Based on more than 200 000 articles from major media outlets, we are going to:

To meet these goals, a number of machine learning techniques are combined. The major steps can be summarised as follows:

  1. 17 general umbrella topics are explored
  2. 5 topics are selected for further analysis
  3. Deep dives are presented with 2D interactive maps

The topics selected for the deep dives are:

  1. AI and Robots
  2. Policy (sums up 3 relevant areas)
  3. Media
  4. Business
  5. Cybersecurity

The Policy topic groups together 3 areas: Social media crisis, Privacy and 5G.

Wide areas selected for deep-dive analyses

The 17 umbrella topics are identified using the topic modelling technique Latent Dirichlet Allocation. Besides the topics selected for deep dives, areas such as Smartphones, CPU and other hardware, Digital ecosystems and Space are also highlighted.

Next, various maps are created based on the t-SNE algorithm. The example below presents the news stories in two dimensions: articles that report on the same subject are clustered together. We demonstrate that this technique is highly useful for discovering narrower, domain-specific areas within the umbrella topics. Moreover, the distance between clusters is also meaningful, enabling the analysis of relationships between topics as well.

As an example, within the AI and robots topic, the map reveals groups of articles focused on such issues as:

The map also shows that articles on social and ethical issues lie closer to each other, while articles on AI in self-driving cars are placed near business news on ride-sharing apps. This indicates that our methodology is effective at reducing the complexity of text data, making it possible to analyse and map topics.

All maps are interactive, inviting users to explore the headlines of the articles.

Example articles

Social and economic aspects of robotics

Ethical aspects of AI

Self-driving cars technologies

The presented methodology provides intuitive, easily understandable results. To enhance the exploration of results, the study is presented as an interactive guide. This report has been designed with different readers in mind, offering various journeys. To analyse and understand the results, it is sufficient to read the introduction and results sections. We also prepared a guide briefly explaining various text-mining methods for anyone interested. Finally, detailed descriptions of the methods are included in the methods section so that the study can be reproduced.



NGI Forward has received funding from the European Union's Horizon 2020 research and innovation programme under the Grant Agreement no 825652. The content of this website does not represent the opinion of the European Union, and the European Union is not responsible for any use that might be made of such content.

Zenodo: data GitLab: codes
