The Deeptech 50 is a Crunchbase overview of 50 venture capital (VC) firms operating in the 'deep tech' space. It is essential for these firms to be visible to both potential investors and promising startups.
The aim of this study is to identify and study the Twitter sub-network in which the VC firms from the Deeptech 50 list are embedded.
The Deeptech 50 overview was pulled from Crunchbase. Out of the 50 VC firms, 45 provide Twitter handles on their Crunchbase profiles. These firms, referred to as the Deeptech 45, were included in the analysis. Twitter data was collected through the Twitter API, using the Tweepy library, from December 20 to 28, 2020.
To identify the Twitter network in which the Deeptech 45 are embedded, we will first 'branch in' by identifying shared connections among the Deeptech 45 and then 'branch out' by mapping the network in which those shared connections are embedded. This method should delineate any Twitter community if the original seed population represents a distinct network.
The rationale is as follows: We are interested in a population 'A' (e.g., people interested in 'cats'), of which A1, A2, and A3 are members (e.g., three users with a cat in their profile picture). Note that in terms of connectivity, A1, A2, and A3 are not necessarily at the center of A. However, by mapping all Twitter accounts connected to A1, A2, and A3, we should be able to delineate the core of A (e.g., user profiles associated with famous cats or cat owners). After identifying the core of A, we then 'branch out' and map users connected to this core. This should provide a representative sample of the center of the network in which A1, A2, and A3 are embedded.
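The two steps can be sketched in a few lines of Python. This is a minimal illustration on invented toy data, not the code used in the study: the function names, the `min_shared` threshold, and the account names are all assumptions made for the example.

```python
from collections import Counter


def branch_in(follows, seeds, min_shared=2):
    """Return accounts followed by at least `min_shared` seed accounts.

    `min_shared` is a hypothetical threshold chosen for this example.
    """
    seed_set = set(seeds)
    counts = Counter()
    for seed in seed_set:
        # Count each followed account once per seed, ignoring other seeds.
        counts.update(set(follows.get(seed, ())) - seed_set)
    return {account for account, n in counts.items() if n >= min_shared}


def branch_out(follows, core):
    """Return the core plus every account followed by a core member."""
    reached = set(core)
    for account in core:
        reached.update(follows.get(account, ()))
    return reached


# Toy follow lists (account -> accounts it follows), invented for illustration.
follows = {
    "A1": {"famous_cat", "cat_owner", "news"},
    "A2": {"famous_cat", "cat_owner"},
    "A3": {"famous_cat", "dog_fan"},
    "famous_cat": {"cat_owner", "cat_vet"},
    "cat_owner": {"famous_cat", "cat_shelter"},
}
seeds = ["A1", "A2", "A3"]

core = branch_in(follows, seeds, min_shared=2)
network = branch_out(follows, core)
print(sorted(core))
print(sorted(network))
```

In this toy example the seeds themselves do not end up in the mapped network, which mirrors the point above that A1, A2, and A3 need not sit at the center of A.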
To define the network in which the Deeptech 45 are embedded, we first mapped the connections between the Twitter accounts of the Deeptech 45. We observed that almost no bidirectional connections exist among the firms; most connections are one-directional, with slightly less than half of the VC firms following First Round and/or Kleiner Perkins (Figures 1 and 2).
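As a rough sketch of this step, the in-group follow edges, the mutual pairs, and the number of in-group followers per account can be derived from plain follow lists. The data layout, function name, and firm names below are hypothetical and only illustrate the bookkeeping.

```python
def summarize_in_group(follows, group):
    """Summarize follow relations that stay inside `group`.

    Returns the directed in-group edges, the set of mutual (bidirectional)
    pairs, and the number of in-group followers per account.
    """
    group = set(group)
    # Keep only follow edges where both ends belong to the group.
    edges = {
        (a, b)
        for a in group
        for b in follows.get(a, ())
        if b in group and b != a
    }
    mutual_pairs = {frozenset((a, b)) for a, b in edges if (b, a) in edges}
    in_group_followers = {account: 0 for account in group}
    for _, target in edges:
        in_group_followers[target] += 1
    return edges, mutual_pairs, in_group_followers


# Invented toy data: account -> accounts it follows.
follows = {
    "firm_a": {"firm_c", "firm_d"},
    "firm_b": {"firm_c", "outsider"},
    "firm_c": {"firm_d"},
    "firm_d": {"firm_c"},
}
group = ["firm_a", "firm_b", "firm_c", "firm_d"]

edges, mutual_pairs, in_group_followers = summarize_in_group(follows, group)
print(len(edges), len(mutual_pairs), in_group_followers)
```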
A similar pattern emerges when examining Twitter interactions. Figure 3 shows information flow within the Deeptech 45, analyzing retweets, quote tweets, responses, and '@' mentions from the last 1,000 tweets on each user's profile. A16z and First Round are major sources of information, but Kleiner Perkins, despite its many followers, is not a major source of interactions.
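One way to turn tweets into an information-flow network is sketched below. It assumes each tweet has already been reduced to a small record; the field names (`author`, `retweet_of`, `quote_of`, `reply_to`, `mentions`) are hypothetical and do not correspond to a specific API response. An edge runs from the account being interacted with (the source of information) to the account that interacts.

```python
from collections import Counter


def interaction_edges(tweets, members):
    """Count information-flow edges (source -> interacting account)."""
    members = set(members)
    edges = Counter()
    for tweet in tweets:
        reader = tweet["author"]
        # Collect every account this tweet interacts with, once per tweet,
        # so a retweet that also mentions its author is not double counted.
        targets = set(tweet.get("mentions", ()))
        for key in ("retweet_of", "quote_of", "reply_to"):
            if tweet.get(key):
                targets.add(tweet[key])
        for source in targets:
            if source in members and reader in members and source != reader:
                edges[(source, reader)] += 1
    return edges


# Invented toy tweets for illustration.
tweets = [
    {"author": "firm_b", "retweet_of": "firm_a", "mentions": ["firm_a"]},
    {"author": "firm_c", "quote_of": "firm_a"},
    {"author": "firm_c", "reply_to": "firm_b", "mentions": ["firm_b", "outsider"]},
    {"author": "firm_a", "mentions": []},
]
members = {"firm_a", "firm_b", "firm_c"}

edges = interaction_edges(tweets, members)

# Total outgoing weight per account: how often others interact with it.
sources = Counter()
for (source, _), weight in edges.items():
    sources[source] += weight
print(sources.most_common())
```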
Next, we looked at the network in which the Deeptech 45 are embedded. To do this, we assumed that the Deeptech 45 represent a distinct subnetwork on Twitter and that this network can be delineated by tracking Twitter follows. Since the combined number of accounts followed by, and following, the Deeptech 45 is approximately 675,000, we restricted the analysis to bidirectional follows to keep the network tractable. Figure 4 shows connections between the 100 most central Twitter accounts and the Deeptech 45. Most Deeptech 45 accounts are not central in this network. The five most central accounts are highlighted in Figures 5-9, and they all appear to operate in the tech VC sector.
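Restricting the data to bidirectional follows amounts to intersecting, for each account, the set of accounts it follows with the set of accounts that follow it. The sketch below illustrates this on invented data. The text does not specify which centrality measure was used to rank accounts, so plain degree (number of mutual connections) is used here purely as an assumption.

```python
def mutual_follow_graph(friends, followers):
    """Build an undirected graph from bidirectional follows.

    `friends[a]` holds the accounts that `a` follows and `followers[a]`
    the accounts that follow `a`.
    """
    graph = {}
    for account in friends:
        mutuals = set(friends[account]) & set(followers.get(account, ()))
        for other in mutuals:
            graph.setdefault(account, set()).add(other)
            graph.setdefault(other, set()).add(account)
    return graph


def top_central(graph, n):
    """Rank accounts by degree (an assumed stand-in for centrality)."""
    # Ties are broken alphabetically to keep the ranking deterministic.
    return sorted(graph, key=lambda a: (-len(graph[a]), a))[:n]


# Invented toy data for three seed accounts.
friends = {
    "seed_1": {"vc_x", "vc_y", "news"},
    "seed_2": {"vc_x", "vc_z"},
    "seed_3": {"vc_x", "vc_y"},
}
followers = {
    "seed_1": {"vc_x", "vc_y", "fan"},
    "seed_2": {"vc_x"},
    "seed_3": {"vc_x", "vc_y", "vc_z"},
}

graph = mutual_follow_graph(friends, followers)
print(top_central(graph, 2))
```

The same two functions could in principle be reapplied to the accounts returned by `top_central` to grow the network by one more step.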
Based on profile descriptions and the high number of bidirectional follows, it seems likely that the mapped network represents a distinct subgroup on Twitter. However, since the Deeptech 45 are somewhat peripheral, the network may not be exclusively defined by deep technology and venture capital. Random sampling showed many accounts belong to female tech influencers and leaders, suggesting other defining characteristics.
Having identified the core of the Twitter network in which the Deeptech 45 are embedded, we next wanted to study a more complete version of that network. To do so, we applied the same method as before to the members of the primary network (i.e., mapping bidirectional follows) and collected the 500 most central accounts. The final network shows a dense cluster with the Deeptech 45 still on the periphery (Figures 10 and 11).
We mapped information flow using the last 1,000 tweets on profiles within the secondary network (Figure 12). Despite their peripheral position, some Deeptech 45 members are central in terms of interactions (Figures 13-16).
Other accounts that attract many interactions are shown in Figure 14, using the same measure (information centrality).
The account with the highest information centrality is @hunterwalk (Figure 15). This likely reflects both the quality and the frequency of their tweets.
Furthermore, we identified 75 accounts that were heavily interacted with from within this network, while not being part of the network based on connectivity. These 'outside influences' are depicted in Figure 16. They include major news sources (TechCrunch, NYT, Forbes, and the Wall Street Journal) but also an account belonging to a London-based individual named 'Harry Stebbings'.
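Finding such outside influences comes down to counting interactions whose origin lies inside the network and whose target lies outside it. The sketch below uses invented data and a hypothetical `(actor, target)` pair format, where the actor is the account that retweets, quotes, replies to, or mentions the target.

```python
from collections import Counter


def outside_influences(interactions, network, top_n):
    """Return the `top_n` non-members most interacted with by members."""
    network = set(network)
    counts = Counter()
    for actor, target in interactions:
        # Keep interactions that start inside and point outside the network.
        if actor in network and target not in network:
            counts[target] += 1
    return counts.most_common(top_n)


# Invented toy interactions for illustration.
interactions = [
    ("m1", "news_outlet"),
    ("m2", "news_outlet"),
    ("m3", "podcaster"),
    ("m1", "m2"),
    ("stranger", "news_outlet"),
    ("m2", "podcaster"),
    ("m3", "news_outlet"),
]
network = {"m1", "m2", "m3"}

top_outside = outside_influences(interactions, network, top_n=2)
print(top_outside)
```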
The secondary network is mostly based in the San Francisco Bay Area, with some East Coast presence, suggesting it may be best defined as the "Silicon Valley VC Network" (Figure 18).
As a final analysis, we compared the number of in-group followers of each Deeptech 45 firm with its information centrality within the secondary network. We assumed a linear relationship between these two measures, since VC firms with many followers are more likely to have their tweets seen, retweeted, and mentioned. Firms that outperform this relationship (i.e., have a high information centrality despite a low number of in-group followers) could be said to reach a higher 'rate of return' on their Twitter activity (e.g., NFX, Lightspeed), presumably because their tweets are of higher quality. Note, however, that this analysis is limited to the network defined above; firms that 'under-perform' may in fact be targeting a different Twitter community. As an example, Y Combinator appears to under-perform in this network, but it is a major presence in the VC world, and the network we mapped here might not be its specific target audience.
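Under the assumed linear relationship, out- and under-performance can be read off as the residual of each firm relative to a least-squares line. The sketch below shows the arithmetic with an ordinary least-squares fit written out by hand. The firm names and values are invented and are not results from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def performance_residuals(data):
    """Map each account to (observed - predicted) information centrality."""
    names = list(data)
    xs = [data[name][0] for name in names]
    ys = [data[name][1] for name in names]
    slope, intercept = fit_line(xs, ys)
    return {
        name: y - (slope * x + intercept)
        for name, x, y in zip(names, xs, ys)
    }


# Invented values: account -> (in-group followers, information centrality).
data = {
    "firm_a": (10, 0.10),
    "firm_b": (20, 0.20),
    "firm_c": (30, 0.45),
    "firm_d": (40, 0.35),
}

residuals = performance_residuals(data)
outperformer = max(residuals, key=residuals.get)
underperformer = min(residuals, key=residuals.get)
print(outperformer, underperformer)
```

A positive residual marks an account with more information centrality than its in-group follower count would predict, and a negative residual marks the opposite.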
This study shows that it is possible to identify and delineate a distinct Twitter subgroup and identify major sources of content within that network. Specifically, we mapped the Twitter network in which the Deeptech 45 is embedded. Future research should investigate why certain accounts achieve higher interaction rates per follower. Additionally, we aim to use topic analysis to map the flow of information through the secondary network in more detail.