Data Acquisition and Management


DATA4200
Data Acquisition and
Management
Lesson 11
[Industry Applications in Neo4j]
Lesson Learning Outcomes
1
Review the concepts relating to
Recommendation engines
2
Apply Neo4j to an Industry application
Today’s Business problem and data
• Business problem: Making a recommendation engine for
Supermarket customers
• Data: Seven CSV files on the portal (week 11)
– categories.csv
– customers.csv
– employees.csv
– orders.csv
– products.csv
– suppliers.csv
– order-details.csv
Weekly topic’s introduction activity
• Today we will design a recommendation engine using
the graph database Neo4j
• Before we do, let’s have a 5-minute brainstorming activity as
follows.
• Suppose that you own a supermarket. If you were to design
a recommendation engine, what data would you need?
Personalisation and
Recommendation Engines
• How do companies like Amazon, YouTube, and Netflix know
what “you might also like”?
• How do these companies know what people like you like?
• Increasingly, Big Data is harvested to provide real-time
personalization of products & services
• Amazon: knows the types of consumer goods and books you
like
• Netflix: suggests movies and TV shows you might want to
watch
https://mitpress.mit.edu/books/recommendation-engines
Amazon Personalize &
Recommendation Engine
• Applies Machine Learning on Big Data to
deliver a wide array of personalization
experiences:
o Product recommendations
o Direct marketing
• Idea: create delightful user experiences
through improved engagement and user
conversions
https://mitpress.mit.edu/books/recommendation-engines
Amazon Personalize &
Recommendation Engine
Integration with:
• Apps, websites
• Emails and SMS
• Marketing automation
Watch and answer the
following
• What are the benefits of
this recommendation
engine for the customer?
• How is A/B testing used in
this case?
Big Data & NoSQL Graph Databases
• NoSQL databases can be faster than traditional relational databases
for some workloads, such as relationship-heavy queries, and allow for complex analyses
• Good for diverse (messy) data sets
• Focus on relationships
• Uncover relationships that would not otherwise be detected

What is a Graph Database… and Why Big Data Needs One


Product Recommendations with
Graph Database: Neo4j
• Typical retail dataset (traditional SQL databases):
– Customers: customer ID, phone etc…
– Products: product name, product ID, unit price
– Categories: product categories, description
– Orders: Order ID, shipping information
– Suppliers: company name, supplier ID
• May also have:
– Employee data: with direct-contact sales, who sold
the product or service?
– Employee performance e.g. KPIs
Step 1: Determine Entities &
Relationships
Decide on the entities and their relationships
• Entities:
– customers (with properties: customer ID, address, email,
mobile etc…)
– Products
– Product Categories
– Orders
– Suppliers
• Relationships:
– Customer [buys] Product(s) [made_by] Producer (suppliers)
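The entity and relationship design above can be sketched in plain Python before any database is involved. This is a minimal illustration, not the lesson's Neo4j model: the `Node` and `Relationship` classes, the relationship names `BUYS` and `MADE_BY`, and all sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, e.g. a customer or a product."""
    label: str                      # entity type, e.g. "Customer"
    key: str                        # hypothetical unique identifier
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, named link between two nodes."""
    start: Node
    rel_type: str                   # e.g. "BUYS" or "MADE_BY"
    end: Node


# Illustrative values only; they are not taken from the lesson's CSV files.
customer = Node("Customer", "C1", {"email": "c1@example.com"})
product = Node("Product", "P1", {"name": "Green tea"})
supplier = Node("Supplier", "S1", {"company": "Example Foods"})

# Customer [buys] Product [made_by] Supplier
graph = [
    Relationship(customer, "BUYS", product),
    Relationship(product, "MADE_BY", supplier),
]


def describe(rel: Relationship) -> str:
    """Render a relationship in an arrow notation similar to Cypher's."""
    return f"(:{rel.start.label})-[:{rel.rel_type}]->(:{rel.end.label})"


paths = [describe(rel) for rel in graph]
```

Writing the model down this way makes it easy to check that every entity has a key and that every relationship has a clear direction before the data is loaded.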
Step 2: Data Acquisition
• Data acquisition from SQL databases
– Direct export
– Via Application Programming Interfaces: APIs
• Conversion into CSV (rows and columns)
• Data pre-processing: cleaning and formatting data
– For example: using Python programming
• Store raw data in cloud storage
– Examples: Amazon S3, Wasabi, Microsoft Azure Data
storage, Google Cloud Platform data storage
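The cleaning and formatting step can be sketched with Python's standard `csv` module. The column names (`customerID`, `companyName`, `phone`) and the sample rows below are assumptions for illustration; the real files on the portal may differ.

```python
import csv
import io

# Hypothetical raw export with stray whitespace, a blank ID, and a duplicate.
raw_csv = """customerID,companyName,phone
 C1 ,Example Foods , 555-0100
C2,Sample Mart,555-0101
,Missing Id Ltd,555-0102
C2,Sample Mart,555-0101
"""


def clean_rows(text: str) -> list[dict]:
    """Strip whitespace, drop rows without an ID, and remove duplicate IDs."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        row = {key.strip(): value.strip() for key, value in row.items()}
        customer_id = row["customerID"]
        if not customer_id or customer_id in seen:
            continue
        seen.add(customer_id)
        cleaned.append(row)
    return cleaned


def to_csv(rows: list[dict]) -> str:
    """Write the cleaned rows back out as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cleaned = clean_rows(raw_csv)
cleaned_csv = to_csv(cleaned)
```

In practice the cleaned text would be written to a file and uploaded to cloud storage, ready to be loaded into the graph database.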
Step 3: Insert & Store Data into
NoSQL graph databases
• Load data into NoSQL graph databases and set properties:
– LOAD CSV WITH HEADERS FROM "CSV FILE" AS row
– LOAD CSV: loads the CSV data
– WITH HEADERS: the file has column headers
– AS row: reads every row of data, one at a time
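In Cypher, LOAD CSV WITH HEADERS exposes each row as a map keyed by column header, with every value read as a string. Python's `csv.DictReader` behaves in a comparable way, so the sketch below uses it as an analogy only; it does not talk to Neo4j. The column names and rows are assumptions for illustration.

```python
import csv
import io

# Hypothetical extract; the column names are assumptions for illustration.
products_csv = """productID,productName,unitPrice
1,Chai,18.00
2,Chang,19.00
"""

nodes = []
for row in csv.DictReader(io.StringIO(products_csv)):
    # Every value arrives as a string, so numeric columns are converted explicitly.
    nodes.append({
        "label": "Product",
        "productID": row["productID"],
        "productName": row["productName"],
        "unitPrice": float(row["unitPrice"]),
    })
```

The explicit `float(...)` conversion mirrors the type conversion that is typically needed when setting numeric properties from CSV values.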
Step 3: Neo4j
Step 4: Create Relationships between
entities
• Between customers and orders
• Between orders & products
• Between products & suppliers
• Between products & product categories
• For employees
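The relationship-building step can be illustrated in plain Python by matching ID columns across tables. This is a sketch under assumptions: the column names and the relationship types (`PURCHASED`, `SOLD`, `CONTAINS`, `SUPPLIES`, `PART_OF`) are hypothetical and not taken from the lesson's Cypher statements.

```python
# Hypothetical rows, as they might look after loading the CSV files.
orders = [
    {"orderID": "O1", "customerID": "C1", "employeeID": "E1"},
    {"orderID": "O2", "customerID": "C2", "employeeID": "E1"},
]
order_details = [
    {"orderID": "O1", "productID": "P1"},
    {"orderID": "O1", "productID": "P2"},
    {"orderID": "O2", "productID": "P1"},
]
products = [
    {"productID": "P1", "supplierID": "S1", "categoryID": "K1"},
    {"productID": "P2", "supplierID": "S2", "categoryID": "K1"},
]

# Each relationship is a (start node, type, end node) triple.
relationships = []
for order in orders:
    relationships.append((order["customerID"], "PURCHASED", order["orderID"]))
    relationships.append((order["employeeID"], "SOLD", order["orderID"]))
for line in order_details:
    relationships.append((line["orderID"], "CONTAINS", line["productID"]))
for product in products:
    relationships.append((product["supplierID"], "SUPPLIES", product["productID"]))
    relationships.append((product["productID"], "PART_OF", product["categoryID"]))
```

Each foreign-key column in the relational export becomes an explicit relationship, which is what lets later queries follow paths instead of joining tables.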
Step 5: Set additional properties
• Order details:
Query Database to Gain Business
Insights
• Which customer purchased what product & who
supplied it?
• Who supplies the product, and which product
category does it belong to?
• Team Sales Performance?
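The first question (which customer purchased what product, and who supplied it) amounts to following a path through the graph. The sketch below walks that path over an in-memory list of triples. The names, relationship types, and data are assumptions for illustration, and the code stands in for a graph query rather than reproducing the lesson's Cypher.

```python
from collections import defaultdict

# Hypothetical relationship triples: (start node, type, end node).
relationships = [
    ("Alice", "PURCHASED", "O1"),
    ("Bob", "PURCHASED", "O2"),
    ("O1", "CONTAINS", "Tea"),
    ("O1", "CONTAINS", "Rice"),
    ("O2", "CONTAINS", "Tea"),
    ("Acme", "SUPPLIES", "Tea"),
    ("Grainco", "SUPPLIES", "Rice"),
]

# Index outgoing edges by (start, type) and incoming edges by (end, type).
outgoing = defaultdict(list)
incoming = defaultdict(list)
for start, rel_type, end in relationships:
    outgoing[(start, rel_type)].append(end)
    incoming[(end, rel_type)].append(start)


def purchases_with_suppliers(customer: str) -> list[tuple[str, str, str]]:
    """Follow customer -> order -> product <- supplier and collect the matches."""
    results = []
    for order in outgoing[(customer, "PURCHASED")]:
        for product in outgoing[(order, "CONTAINS")]:
            for supplier in incoming[(product, "SUPPLIES")]:
                results.append((customer, product, supplier))
    return results


alice_rows = purchases_with_suppliers("Alice")
```

The same traversal pattern, with different relationship types, would answer the supplier/category and sales-performance questions.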
Recommendation Engines
• An estimated 35% of Amazon’s revenue comes from its recommendation
engine (see: https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers)
• There are two basic types:
– Content-based filtering
– Collaborative filtering
Recommendation Engines
• Content-based filtering uses item features (entities) to recommend
other items similar to what the user likes (relationships), based on
the user’s previous actions (relationships) or explicit feedback (feedback
entity)
– Advantages: the model relies solely on the user’s own data, not that of other users,
which makes it easier to scale to a large number of users; it captures the specific
interests of a user
– Disadvantages: requires domain knowledge to hand-engineer the features, so the
model can only be as good as the hand-engineered features; recommendations are
limited to the existing interests of the user
– See: https://developers.google.com/machine-learning/recommendation/content-based/summary
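A minimal sketch of content-based filtering follows, assuming each product is described by a hand-chosen set of feature tags and that similarity is measured with the Jaccard index. Both choices, and all product names, are assumptions for illustration; the lesson does not prescribe a particular similarity measure.

```python
# Hypothetical item features; each product is described by a set of tags.
item_features = {
    "Green tea": {"beverage", "tea", "organic"},
    "Black tea": {"beverage", "tea"},
    "Coffee": {"beverage", "coffee"},
    "Brown rice": {"grain", "organic"},
}


def jaccard(a: set, b: set) -> float:
    """Share of features two items have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(liked: list[str], top_n: int = 2) -> list[str]:
    """Rank items the user has not liked yet by their best similarity to a liked item."""
    scores = {}
    for candidate, features in item_features.items():
        if candidate in liked:
            continue
        scores[candidate] = max(
            (jaccard(features, item_features[item]) for item in liked),
            default=0.0,
        )
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:top_n]


suggestions = recommend(["Green tea"])
```

Note that only this one user's likes are consulted, which reflects the advantage listed above, while the hand-written tags reflect the disadvantage.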
Recommendation Engines
• Collaborative filtering:
– Addresses some of the shortfalls of content-based
filtering by using similarities between users and
between items simultaneously
– Enables “serendipitous” recommendations,
meaning that the model can recommend an item to a user
based on the interests and behaviours of similar users
– Learns features automatically, with no hand-engineering required
– See: https://developers.google.com/machine-learning/recommendation/collaborative/basics
Content-based Filtering with Graph
databases
Collaborative Filtering with Graph
databases
• Collaborative filtering is a technique used by
recommendation engines to recommend content based on
feedback from other customers.
• That is, people who tended to agree in the past are likely to
agree in the future.
• Example: Netflix makes recommendations based on the ratings
a customer has given to shows they have already watched.
• One simple approach is the KNN (k-nearest neighbours) algorithm; more
advanced algorithms exist.
• KNN works by finding the k items (or customers) that are most
similar to a given one.
• Similarity could be measured from the ratings two customers give to the same product.
• For the model to work, there must be some feedback or expressed interest,
e.g. “ratings relationships”.
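A small sketch of KNN-style collaborative filtering over product ratings follows. It assumes cosine similarity computed over the products two customers have both rated, and a similarity-weighted sum to score candidates. These choices, the customer names, and the ratings are assumptions for illustration, not the method used in the lesson's Neo4j queries.

```python
from math import sqrt

# Hypothetical ratings (1-5) that customers gave to products.
ratings = {
    "Ann": {"Tea": 5, "Coffee": 1, "Rice": 4},
    "Ben": {"Tea": 4, "Coffee": 2, "Rice": 5, "Pasta": 5},
    "Cat": {"Tea": 1, "Coffee": 5, "Juice": 4},
}


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity computed over the products both customers rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(a[p] ** 2 for p in shared))
    norm_b = sqrt(sum(b[p] ** 2 for p in shared))
    return dot / (norm_a * norm_b)


def nearest_neighbours(user: str, k: int) -> list[tuple[str, float]]:
    """Return the k customers whose ratings are most similar to the user's."""
    scores = [
        (other, cosine_similarity(ratings[user], ratings[other]))
        for other in ratings if other != user
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


def recommend(user: str, k: int = 1) -> list[str]:
    """Suggest products that the nearest neighbours rated but the user has not."""
    weighted = {}
    for other, similarity in nearest_neighbours(user, k):
        for product, rating in ratings[other].items():
            if product not in ratings[user]:
                weighted[product] = weighted.get(product, 0.0) + similarity * rating
    return sorted(weighted, key=lambda name: (-weighted[name], name))


neighbours = nearest_neighbours("Ann", k=1)
suggestions = recommend("Ann", k=1)
```

Here Ann's closest neighbour is Ben, because their ratings of tea, coffee, and rice agree closely, so the product Ben rated that Ann has not tried is suggested to her.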
Collaborative Filtering with Graph
databases: Product Ratings
What are the Top 15 Similarities for
the User?
What to recommend for User?
Business Cases
• Walmart: https://go.neo4j.com/rs/710-RRC-335/images/neo4j-casestudywalmart.pdf?_ga=2.216181142.1429396604.1518252687-1516255675.1518252687
• eBay: https://neo4j.com/press-releases/ebay-walmart-adopt-neo4j-graph-transforming-retail/
• More case studies:

Case Studies


Next Week
• Assessment in class
