We’re talking about Scraping Open Web Discussions, with Pedro Maioli, Data Analyst.
Scraping Open Web Discussions Agenda:
Reputation
Figurative vs. literal
AI rep vs. human rep
Reputation data collection methods
Traditional vs. online
Planned vs. spontaneous
Open Web reputation data
Some examples
Turning scraping into product
Cases
1. Digital You Can Trust |
PRESENTER: Pedro Maioli
DATE: Sept 2019
Data Science
2. Digital You Can Trust |
SCRAPING OPEN WEB DISCUSSIONS
PEDRO MAIOLI
DATA ANALYST
Brazilian, 26
ccomp @ uesb '18
polyglotting (human || machine) languages
@mai0li
guitar player
glitch artist
3. Digital You Can Trust |
● Reputation
○ Figurative vs. literal
○ AI rep vs. human rep
● Reputation data collection methods
○ Traditional vs. online
○ Planned vs. spontaneous
● Open Web reputation data
○ Some examples
● Turning scraping into product
○ Cases
Contents
13. Digital You Can Trust |
Traditional Offline Methods
boring / inconvenient / invasive / don't see importance
14. Digital You Can Trust |
Online Methods
still boring
slightly less inconvenient
still little importance
15. Digital You Can Trust |
Methods / Landscape
Users 'inside dept. store'
Willing to participate, but only if it's really quick
Still hard for companies to identify key points on reputation
Medium-to-hard to identify key points on product/service class
(not only client brand)
Customers hardly see the 'we value your opinion' concept behind
this
16. Digital You Can Trust |
Differ
Constant factor: users WEREN'T giving their opinion in the first place;
They were ALL doing different things when asked;
They won't give it a spontaneous or truly interested thought.
What will they do?
17. Digital You Can Trust |
Conversations!
(reserved for big tech, though)
19. Digital You Can Trust |
(comments section)
(content produced)
(oh god please don't)
20. Digital You Can Trust |
landscape
3.48 billion social media users in 2019
(https://wearesocial.com/blog/2019/01/digital-2019-global-internet-use-accelerates)
71% of consumers who have had a good social media interaction experience with a brand are
likely to recommend it to others
(https://www.getambassador.com/blog/social-customer-service-infographic)
96% of the people that discuss brands online do not follow those brands’ owned profiles
(https://www.brandwatch.com/blog/marketing-dark-matter-social-media-and-the-number-96/)
23. Digital You Can Trust |
Bad PR on OTA booking (not Expedia)
In-depth discussion
Free advertising by reddit user
Bad PR on Expedia
Free advertising by reddit user
Neutral mention of Expedia usage
Neutral mention of Expedia usage
Bad PR on OTA booking (not Expedia)
Potential client asking for advice
Free keyword research
24. Digital You Can Trust |
Genuine feedback,
even from workers.
Overall + rep.
25. Digital You Can Trust |
Turning scraping into product / cases
a) The obvious 'social-media-management-tools' way
b) The 'create-your-platform' way
c) The insightfulness/'BI' way
26. Digital You Can Trust |
a) smmt
pros
- rapid interactions (likes, favs, shares, RTs)
- comments (that become discussions)
- sentiment analysis (bad rep, good rep)
- manage replies / crisis in one place
- time series
cons
- LOTS of agencies doing it
- dealing with hateful speech is VERY stressful
- not something I'd really wish anyone in IMWT to work on directly
(This is not us, but we want your statistics)
27. Digital You Can Trust |
b) cyp
pros
- product has value regardless of SOW/brand contract
- deep impact in virtually every service business field
cons
- creating own metrics is technically challenging
- additional UX/UI research implementation costs
- heavy reliance on user adoption / user-generated metrics
- takes lots of time to consolidate
28. Digital You Can Trust |
c) imwt
pros
- best of both worlds (technical + PR skills)
- additional value based on business intelligence + key findings
- M2M (?) relations (avoid competing directly with other agencies)
- grants us freedom in testing/suggesting things
challenges
- the usual understand-the-data delays
- difficulty keeping the 'innovative/reinventing' responsibility/spot
- relying on user-generated in-depth discussions over time
30. We’re a global online marketing
agency managed from one of
the finest beaches on the planet.
Digital You Can Trust |
GET IN
TOUCH
Editor's notes
Why do I want to talk about reputation? Reputation precedes transaction, in every sense of these words.
We can talk about the figurative sense - the buying entity (user, company, gov) believes that the selling entity (again, user, company, gov) it is buying from can be trusted, and that it is going to receive a good-quality product or service for the sum it's going to spend/invest - yeah, that's figurative…
... or the literal sense, which is usually handled/implemented using digital certificates issued by (again) trusted authorities, usually wrapped up by payment processing companies the likes of VISA, PYPL, AAPL or even FBOK's ultimate take-over-the-world plan, Libra.
In many ways, what I've caught myself doing was either selling the client (access to || knowledge about) its own reputation or even selling reputation itself!
(I mean the improvement, the building of reputation through digital actions (internal linking analysis/directions) that improved the belief major agents (search engines) have in it.)
GARY ILLYES
You can't even see organic results unless you scroll down.
But - BUT - GOOG does not lie. If the user wants Expedia, he or she WILL get Expedia. A whole GOOG box. For Expedia.
My take is - if building reputation amongst users makes even a small fraction of them search for the brand (like Airbnb does), they will still follow the click-the-organic route (Freddy was talking about how GOOG is turning into an answer engine instead of a search one - I remembered Elon Musk saying GOOG is like a mountain: you can climb it, but not move it).
It won't solve the problem, but it helps; it gives us time to figure out our next move.
"I know what you're thinking" sort of slide. But I sort of didn't, certainly not to the extent I know nowadays, thanks to IMWT.
I've been working on the AI (machine) side of reputation during my time with you guys - but since this is a free theme choice, I tried to venture into the other side - the human side of reputation.
Let's say our clients want to know what THEIR clients think about them. Clients' clients. Inception joke.
How did they use to do it? How are they doing it now? How can we do it differently?
Traditional "offline" methods included door-to-door/street surveying, telemarketing, and even SMS texting.
… traditionally, clients (or prospective clients) tend to think of these methods as boring, mildly inconvenient or even invasive.
Refusing to participate is a common answer to those.
And then there was the internet. Slightly less inconvenient, feedback pop-ups populated many websites in all sorts of ways.
Of course, by then, customers are already aware of your service to some extent - sometimes even after the transaction ("how'd you enjoy buying with us?") - other times, they're just wandering around your website, similar to a person just walking around inside a department store.
The point is, they have to be actively interacting with the company in order to be prompted to give feedback/tell us about the reputation we want to know. Everyone here has probably seen a "rate our app on Play Store/App Store" dialog. That's another good example. Again, the user has already downloaded the app, which only allows us to receive feedback from a user "inside" our dept. store.
Recapping the landscape // what can be improved
I attribute the unsolved problems of these practices to one factor.
The mere act of answering a survey about one's perception of another's reputation might change the outcome.
(That's the thing about quantum computing as well! Qubits instead of bits: measuring a qubit changes its state, so you need to work with a prediction of it.)
So where are we doing it? Guess what: humans came up with this thing called "conversation", where we can get together with friends and ramble about things we did or will do, like the location or platform we've chosen to book hotels for the next holiday. This can be done either AFK, like these lovely people from stock photos at a bar, or virtually, through the messaging/social networking apps we can find our friends on, in group chats or one-on-one conversations.
Unfortunately, this kind of data is usually private and protected by laws such as GDPR, unless you are FBOK or GOOG (technically, they DO share it, but on their own terms, making money off of it by selling API plans to automation tools).
Enter the open web: whenever we expand the scope of our conversations to a (search || index)able place, they usually become available not only to friends but also to acquaintances, people with common interests, brands or even total strangers.
Here's what I find magical about these places: each environment will obviously have a different userbase and focus.
Some of them will obviously hold angrier people,
we might find lots of bad PR or gratuitous hate,
but we also find genuine opinions from around the globe.
In some/most of them, it's culturally acceptable to weigh in on the discussion even if you're total strangers.
Nearly half of the world's population is connected
That invisible sign hanging around our necks, 'make me feel important', that Maria mentioned yesterday is not only worn in the workplace.
A good interaction experience might be enough to get recommended
Brands need to go beyond their own profiles/channels/branded searches to perceive users' opinions
Here are a few examples from the 2-3 days I've been checking Twitter (yeah, I do like Twitter), around August 28-29th.
Each conversation is unique; for some, can we even argue it's acceptable to interact as the company itself?
Lots of providers do that: media providers like NFLX or AMZN, and credit card providers as well.
1st panel: acceptable example? The user can be sympathized with and pointed to a support channel (Expedia does a nice job of that, from what I've seen in the past weeks)
2nd panel: (deja vu!) unacceptable example?
3rd panel: unbranded search ("still need to plan"), opportunity to weigh in
4th panel: simple opportunity to endorse good PR through like/retweet actions.
Also, look at how global open web discussions are: Japan, India, US, Europe, professional, personal...
Trying desperately to make people buy your product
Trying desperately to look cool and pulling a "how do you do, fellow kids"
Scheduling a billion automated posts that no one will engage with because they're too specific to your product
The range of open web discussions is wide, and user engagement is very targeted in some places.
Detecting good reputation is as valuable as detecting bad reputation, as we can learn from both kinds of occurrences.
As reddit doesn't have the 280-characters-per-post restriction of the previous Twitter example, I think it's better suited to the analysis of in-depth discussions.
(if this wasn't a coordinated inside job, give this guy a raise)
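One way to make "in-depth" measurable, once a reddit thread has already been collected, is to look at how deep its reply chains go. This is only a sketch under assumptions: the input shape (a dict mapping each comment id to its parent id, with None for top-level comments) and the function names are hypothetical, not reddit's API and not something shown in the talk.

```python
def comment_depths(parents):
    """Return {comment_id: depth}; top-level comments have depth 1.

    `parents` maps comment_id -> parent_id (None for top-level comments).
    Assumes the replies form a tree (no cycles). A comment whose parent
    is missing from `parents` is treated as top-level.
    """
    depths = {}
    for cid in parents:
        # Walk up the reply chain until we reach a root or a known depth.
        chain = []
        current = cid
        while current is not None and current in parents and current not in depths:
            chain.append(current)
            current = parents[current]
        base = depths.get(current, 0)
        # Assign depths back down the chain, from the oldest ancestor to cid.
        for node in reversed(chain):
            base += 1
            depths[node] = base
    return depths


def max_depth(parents):
    """Depth of the longest reply chain in the thread (0 if it is empty)."""
    return max(comment_depths(parents).values(), default=0)
```

Threads could then be ranked by `max_depth` to surface the discussions worth reading in full, rather than relying on likes or upvotes alone.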
Turning this kind of research into a product can be done in many different ways, which I've chosen to group into three classes for agencies: the social media listening/managing way, the 'create-your-platform' way, and the 'insightfulness/BI' way. We'll discuss each of these in more depth in the following slides, but for now let's keep in mind that each one has its own set of suitable tools and is usually biased towards delivering a specific outcome.
1st way: generating statistics through engagement, mentions, interactions, sentiment analysis and reach, amongst many other measures, using automated solutions for scraping most social media platforms. Requires public relations skills, especially if you'll answer clients/prospective clients on behalf of the brand.
Requires fewer technical skills, given that there are many webapps/tools already implementing posts/interactions through social networking APIs (although there's space for custom scraping or 'power insights' like the ones we do for internal linking analysis)
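To make the "sentiment analysis + time series" idea concrete, here is a minimal sketch of how already-collected mentions could be labeled and counted per day. Everything in it is an assumption for illustration: the word lists are a toy stand-in for a real sentiment model, the sample mentions are made up, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date

# Toy word lists (assumption); a real tool would use a trained sentiment model.
POSITIVE = {"great", "love", "recommend", "helpful"}
NEGATIVE = {"worst", "refund", "scam", "cancelled"}


def label_sentiment(text):
    """Label a mention as 'positive', 'negative' or 'neutral' by word overlap."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def daily_sentiment(mentions):
    """Turn (day, text) pairs into {day: Counter of sentiment labels}."""
    series = defaultdict(Counter)
    for day, text in mentions:
        series[day][label_sentiment(text)] += 1
    return dict(series)


# Made-up sample mentions, for illustration only.
SAMPLE_MENTIONS = [
    (date(2019, 8, 28), "Love this booking site, would recommend!"),
    (date(2019, 8, 28), "Worst experience, still waiting for my refund."),
    (date(2019, 8, 29), "Still need to plan my holiday."),
]
```

Calling `daily_sentiment(SAMPLE_MENTIONS)` gives one counter of labels per day, which is the raw material for the time-series charts these tools usually show.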
2nd way: create a platform that 'begs to differ', using in-house metrics. ReclameAqui (a Brazilian conflict-solving website) gained notoriety by creating its own ranked index of companies that answer users' complaints on its platform (by aggregating clients + clients' clients in a single place, it forced many companies to register and maintain a good answerability score there). ReclameAqui has won the Época Business Awards for 2 consecutive years now.
3rd way: aggregating statistics from different sources into business intelligence analysis that supports decision-making.
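As a rough illustration of "aggregating statistics from different sources", the sketch below rolls labeled mentions up into one row per brand and source. The record fields (`brand`, `source`, `sentiment`) and the function name are assumptions made for this example, not a format used in the talk.

```python
from collections import defaultdict


def summarize_mentions(records):
    """Aggregate mention records into per-(brand, source) statistics.

    Each record is a dict with 'brand', 'source' and 'sentiment' keys
    (a hypothetical shape chosen for this sketch).
    """
    totals = defaultdict(lambda: {"mentions": 0, "negative": 0})
    for record in records:
        key = (record["brand"], record["source"])
        totals[key]["mentions"] += 1
        if record["sentiment"] == "negative":
            totals[key]["negative"] += 1
    # Add the share of negative mentions so sources can be compared directly.
    return {
        key: {**stats, "negative_share": stats["negative"] / stats["mentions"]}
        for key, stats in totals.items()
    }
```

A table like this is the kind of input a BI report could start from: the same brand can look very different on Twitter and on reddit, and the interesting findings are usually in that difference.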
Talking to Jack about the mgo and how it has different areas to measure businesses' maturity
Because, from my understanding, we're targeting companies that already have a certain level of social: for example, they might have lots of pages across different social networks that are not verified yet, or they are just scheduling posts and paying attention to the engagement on their own channels but not doing research outside of them. Ideally, they already have in-house people or a small agency involved, so you can approach management and say 'hey, how can we help YOU help your social media team?'