okay so we are
hang on I'm gonna stop recording
Hi everyone, and welcome to the Sequentum
webinar. Today we are showcasing our
third-generation Content Grabber
Enterprise product.
In case you are not aware,
I'm going to tell you a little bit
about Sequentum and our story.
We started in 2010 with our
first-generation product, Visual Web
Ripper. Our CTO started this out of
Australia, and over the course of
supporting many hundreds of customers
he realized that he
had a lot of work to do, so he
re-architected and rebuilt it from the
ground up as the Content Grabber product,
which he released in 2015.
That got the attention of
quantitative and systematic hedge fund
investors in New York City.
Our founders were flown to New
York, and roquant Ventures, which is the
VC arm of a quantitative systematic
hedge fund,
invested in the company. Since
then, in 2018, we set up a headquarters in
New York City
and a center of excellence in Gurgaon,
India, where we do a lot of the agent
development and product development.
We've grown to 40 employees, and in
August of this year we were very proud
to announce the release of our
third-generation product, Content
Grabber Enterprise, which I'm going to be
showing you today.
Content Grabber Enterprise
has three components. It has the desktop, the
integrated development environment,
which is where you write and maintain
all of your agents. It has the servers,
which are your workhorses that execute
all your jobs. And it has an Agent
Control Center in the center that allows
you to manage your agents, your agent
versions, the deployments, the runs, the
schedules, everything centrally, including
your proxies
and tickets. We also have a web
portal that allows you to see the
history of runs for all of your agents,
including key details like
success criteria, the number of page
loads, the number of errors, the data
count per run
per date, and any open tickets associated
with those agents.
All right, so now I'm going to jump right
in and give you a
demo
of the desktop, and I'll show you a
little bit of the Agent Control Center
as well.
The desktop, as you may know, is a
Windows desktop, and in it we have a
custom-built version of the Chromium
browser. Just to give you a sense of
how it works, I'm going to bring up a
website that I want to write an agent
for.
What I'm doing is bringing it up
inside our custom-built browser inside
the tool. You can see that as I
mouse around the page, it's highlighting
various elements that I might want to
click on. I'm going to click
on this administrative support category,
and you see Content Grabber knows
automatically the types of things that
you're going to want to do with that
element.
This time I'm just going to
click on it.
Now you can see in the top left corner
here it's opened up a second tab. It's
keeping track of the flow of your agent
as it goes through the site and loads
dependent pages that provide more detail.
This is great so far: it's
automatically created your workflow, and it's
automatically created a schema in the
background for you.
Now I'm going to start adding
to that schema. This is a list, a very
common
structure in our field of web data
extraction. I'm going to use my mouse
to click on the title of
one of these list items. Then I'm going
to mouse over another one, holding down
the Shift key, and it's automatically
going to create a list item. I'm just
going to do a quick scroll down and see:
did it miss anything? No, it got the whole
list, just like that. Okay, I'm going to
add that command. What it's done is
create the list item: it's
automatically detected all of the items
in the list on that page, and it's
created a click-through link to go to
the detail page, which I'm going to do
now.
Again, see, it loaded another tab for
that page. It's clearly delineating
between the different
dependent pages that I'm loading in
my workflow. I'm going to go here and
get the title.
You see how it's creating the schema
automatically.
I'm going to go here, and
I actually want to transform this
content. I really just want to parse out
the job ID, so I'm going to
generate a regular expression
automatically just to pull that job ID
out. If you know regular
expressions, you know that it's pulling
the text that comes after a colon and a
space.
You don't have to write that stuff from scratch;
that's a really big time saver for your
engineers.
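
For readers following along without the tool, here is a rough Python sketch of what a "text after a colon and a space" expression can look like. The pattern, the function name `extract_job_id`, and the sample input are assumptions for illustration only; the expression Content Grabber actually generates is not shown in the webinar and may differ.

```python
import re
from typing import Optional

# Hypothetical pattern: capture whatever follows the first colon-plus-whitespace.
AFTER_COLON = re.compile(r":\s(.+)")

def extract_job_id(text: str) -> Optional[str]:
    """Return the text after the first ': ' in `text`, or None if there is none."""
    match = AFTER_COLON.search(text)
    return match.group(1).strip() if match else None

# Made-up sample value, not taken from the demo site.
print(extract_job_id("Job ID: 12345"))
```

Returning `None` when there is no colon keeps a missing job ID distinguishable from an empty one.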
Here I'm just going to get the general,
you know, the job description.
I call it JD.
That's a lot of content.
That's all I'm going to get
right now. I'm showing you that there's
the schema in the background and there are
these different tabs that it's created. I'm
going to go ahead and save it. Yes. Now
I'm going to run it in debug.
You can see when I load it up in
debug mode, it's
actually loading the page and displaying
it to me in real time, so I can see
exactly what the agent is doing.
It's clicking on the
various elements that I care about,
capturing the data, and going
down the list automatically.
It's actually moving really slowly.
It's doing that because it has to
render the whole page; usually it would
move a lot faster.
I'm going to stop this
because I don't think we need to
stare at all of that, but I'm going to
view
the internal data. Whoops.
Why does that always happen?
Oh, I know why: because I didn't specify
any...
Let's see.
So let's get my data exported. Here I'm
specifying
the export target,
and now I'm going to go back over
here and view the export
target. No, still getting an error.
Here we go.
These are the different pages. This is
all the internal data that it would then
process and turn into your title,
your job ID, and your job
description.
This is what it would export.
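
To make the exported shape concrete, here is a minimal Python sketch of one row with those three fields written out as CSV. The class name `JobRecord`, the field names, the helper `to_csv`, and the sample row are all hypothetical; the webinar does not show Content Grabber's actual export format or target settings.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List

@dataclass
class JobRecord:
    # Hypothetical field names mirroring the three captured values.
    title: str
    job_id: str
    job_description: str

def to_csv(records: List[JobRecord]) -> str:
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(JobRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()

# Made-up sample row, not data from the demo.
print(to_csv([JobRecord("Office Assistant", "12345", "Answer phones")]))
```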
Now, if I wanted to...
I should just stop right here, shouldn't
I? I don't want