laarcnew | comments | discord | tags | ask | show | place | submit | shawn's favoriteslogin

A variant of this algorithm is what glibc uses for its implementation of memmem[1]. I say "variant" because it uses a shift table in some cases.[2]

I found this original paper describing the algorithm to be notable, in particular for how rigorous it is. I've read it a couple of times so far, but I still don't grok everything. One thing I'm particularly interested in is trying to simplify the algorithm so that you don't need to handle the short and long period cases differently. In particular, the paper hints that this is possible:

> The string-matching algorithm of Section 2 uses the period of the pattern. A previous computation of this period is possible. This precomputation can be made easily by using Knuth, Morris, and Pratt’s algorithm on the pattern. This precomputation, combined with both the computation of a critical factorization of the pattern and our string-matching algorithm, leads to an algorithm that is globally linear in time and space. But, since the string-matching algorithm operates in constant space, it is desirable to improve on the precomputation of the period of the pattern, with the aim of obtaining a global algorithm that is in linear time and constant space. There are two ways to achieve that goal. One is a direct computation of the period by an algorithm operating in linear time and constant space. Such an algorithm is described in [12]. It can also be obtained as a consequence of the results in [17]. Our approach here is different. We shall describe a modification of our string-matching algorithm that avoids the use of the period of the pattern, giving an algorithm that is linear time and constant space on the whole. The point is that the exact value of the period of the pattern is actually needed only when it is less than half the length of the pattern. When the period of the pattern is larger, an even simpler string-matching algorithm is provided.

But they don't seem to explain why they went the route of approximating the period even when they cite algorithms that satisfy their desired complexity constraints (linear time, constant space). So my next task is to read the cited papers and hopefully discern for myself.

[1] - https://sourceware.org/git/?p=glibc.git;a=blob;f=string/memmem.c;h=4bf733f1f03cb27c289bd6dc61590909bb0eefdf;hb=HEAD

[2] - https://sourceware.org/git/?p=glibc.git;a=blob;f=string/str-two-way.h;h=b5011baafa77a2d211598be246657b9a33fd8a2e;hb=HEAD#l401

-----

4 points by rain1 619 days ago

Very good stuff, I learned about the duality between borders and periods and I understand how the KMP algorithm requires O(m) (pattern length) space to construct the table, but the critical factorization only takes up 2 integers O(1) so it's a much more space efficient matching algorithm.

-----

Ask Laarc: Ideas or advice on setting up a QA process, on a small team?
6 points by emily to ask 626 days ago | 7 comments
Show laarc: Text input with a gamepad [video] (bitchute.com)
6 points by enow to show dev mixedreality 635 days ago | 3 comments
Chicago train tracks set on fire to keep trains running in record-breaking cold (theweek.com)
6 points by Doreen to engineering trains chicago 628 days ago | 2 comments
Some Common Mitigation Techniques for Overload in Queueing Networks (wallaroolabs.com)
5 points by seantallen to distributed backpressure loadshedding 631 days ago | discuss

A "save to favorites" or "bookmark" feature, separate from the list that's generated by your upvote history, and longer term, the ability to organize them. Sometimes it's hard to go back and find things, and it would be cool to have the ability to organize research, etc. :)

-----


This was a big problem for me on various forums. I don't want one aggregator to control this on all sites I use either. I ended up keeping per-site folders with text copies of the good stuff with links back to originals. Each site letting one organize the material is a step up. API's that make that easy to automate might be next step up from there. Then, what goes into each site can be locally generated, customized for each per their style, and uploaded.

-----

4 points by shawn 655 days ago

I don't want one aggregator to control this on all sites I use either.

Definitely! This was partly the motivation for starting the site. Your bookmarks are yours, and it's important to us to avoid siloing data. The API might end up similar to https://github.com/HackerNews/API.

-----

2 points by shawn 647 days ago

Ok, there is an initial JSON API.

For any user or item, you can add .json to the endpoint. For example:

https://www.laarc.io/item?id=230

Becomes

https://www.laarc.io/item.json?id=230

And for users: https://www.laarc.io/user.json?id=nickpsecurity

I’ll add endpoints for tags like /l/news soon (and RSS feeds).

-----


Really cool. I like the simplicity of just adding .json. If I'm looking at that right, it has all the stories I submitted in JSON so I could just grab, store, or view them with an app?

-----

2 points by shawn 647 days ago

Correct. If you notice any problems, let me know.

-----


Good work and thanks! :)

-----

3 points by shawn 628 days ago

Ok, you can save stories and comments to favorites.

https://www.laarc.io/favorites?id=shawn

To save something to your favorites, click on an item's timestamp, then click "favorite" (at the top, next to the "parent" link).

If you visit someone's profile page, you can visit links to their submissions/comments/favorites.

-----


Awesome! This takes care of one of the prerequisites for https://www.laarc.io/item?id=230#313 :)

-----


re UFO's. I've seen two. One was, after interviewing pilots, a plane with unusual lighting at a weird angle. The other didn't matching anything I've ever seen or read about with shape and movement that were impossible. I kept interviewing pilots, what few I met. All of them claimed to have seen one, which all looked different, with behavior making them think same thing. They'd all say they screwed with the planes, too. Get close, far away, do nothing for a while, shoot off, do angles/rates with enough G's to kill, and so on. The only thing I can't eliminate is fact that many pilots have an insider culture plus a sense of humor that involves messing with people. It's actually possible those statements are a running joke among most of them on the rest of us. Leaning more in the other direction, though. I just have too little data to know what that means.

Now, far as wild programs government funded, I have two favorites with the evil one first. Blue Peacock, if it was real, just goes to show how much drugs people were doing at the time. Although they gripe about millennials, they aren't trying to plant friggin, nuclear, land mines in the ground while torturing and wasting perfectly good chickens.

http://www.todayifoundout.com/index.php/2016/04/cold-war-nuclear-chickens-defend-west-soviet-invasion/

The other one, which was implemented for a while, was remote viewers in programs like Operation Stargate. That got lots of internal attention. Even better, they made a hilarious, fictional movie that included quite a bit of real stuff from the program like the guy who tried to run through walls and the goat tests. Yall probably enjoy it. :)

https://www.youtube.com/watch?v=GC2TzspJn5A

-----

FreeBSD Desktop – Part 3 – X11 Window System (vermaden.wordpress.com)
2 points by vermaden to freebsd unix desktop x11 628 days ago | discuss
Please tell us what features you'd like in laarc
12 points by shawn to meta laarc 658 days ago | 82 comments

Welcome | Guidelines | Bookmarklet | Feature Requests | Source | Contact | Twitter | Lists

RSS (stories) | RSS (comments)

Search: