Hook Directionality and Exploring Data Lineage and Attribution

One of the problems I’ve been trying to solve lately is getting a better grasp on the lineage of the data I create.

On the small scale, I’d like to see how a specific piece of content converts into other content. For example, a news article about a productivity technique that I come across in Apple News or a newsletter may turn into an OmniFocus task reminding me to delve into the content later. That task then turns into a metadoc in Obsidian where I think more deeply about the content. From there I get an idea for extracting gems that could become an article, a flash card deck, or perhaps a Shortcut, which then leads me to create an OmniFocus item to search for something else, and so on. As I get deeper into my thinking, I want to be able to see where I started, with clear attribution along the way. With a longitudinal view of an item’s lineage from initial capture to its terminus points, I can develop a sense of the full pipeline through which something has been processed.
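To make the "see where I started" idea concrete, here is a minimal sketch of walking a lineage backwards. It assumes each derived item records the single item it came from; the `derived_from` mapping, the item identifiers, and `trace_lineage` are all hypothetical and do not reflect how Hook stores links.

```python
from typing import Dict, List

# Hypothetical directed links: each derived item maps to the item it came from.
# These identifiers are illustrative only; Hook does not store links this way.
derived_from: Dict[str, str] = {
    "omnifocus:read-article": "applenews:productivity-article",
    "obsidian:metadoc": "omnifocus:read-article",
    "flashcards:technique-deck": "obsidian:metadoc",
}


def trace_lineage(item: str, parents: Dict[str, str]) -> List[str]:
    """Walk parent links from an item back to its original capture."""
    path = [item]
    seen = {item}
    while path[-1] in parents:
        parent = parents[path[-1]]
        if parent in seen:  # guard against accidental cycles
            break
        path.append(parent)
        seen.add(parent)
    return path


lineage = trace_lineage("flashcards:technique-deck", derived_from)
print(" <- ".join(lineage))
```

A single-parent mapping keeps the sketch short. Items that draw on several sources would need a list of parents per item and a graph traversal instead of a simple walk.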

On a larger scale, I’d love to be able to see patterns in the linkages. One pattern I would like to be able to see is which content sources (websites, newsletters, books, etc.) are more or less generative than others. This would be a kind of emergent evaluation of an item’s or set of items’ Utility from the CUPA framework in Cognitive Productivity. On the upstream side of information flow, this could expose potential bias I have toward certain sources in my field of view/ecosystem, help me target my attention to potentially higher-value (i.e. higher-generativity) content sources, or perhaps find ways to simplify lower-value sources.

On the downstream side of patterns in the linkages, this could help me find termination points in my data processing. For example, I might find that content I put into DEVONthink rarely turns into something else, whereas content that goes into OmniFocus usually does, so maybe I should use OmniFocus instead.
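As a rough illustration of both the upstream and downstream views, the sketch below counts how many links each source generates and lists places that only ever receive links. It assumes directed links are available as `(source, target)` pairs at the app level; the sample data and both function names are made up for illustration.

```python
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical directed links as (source, target) pairs at the app level.
links: List[Tuple[str, str]] = [
    ("Newsletter", "OmniFocus"),
    ("Newsletter", "DEVONthink"),
    ("Apple News", "OmniFocus"),
    ("OmniFocus", "Obsidian"),
    ("OmniFocus", "Obsidian"),
    ("Obsidian", "OmniFocus"),
]


def generativity(pairs: List[Tuple[str, str]]) -> Counter:
    """Count how many outgoing links each source has produced."""
    return Counter(source for source, _ in pairs)


def termination_points(pairs: List[Tuple[str, str]]) -> Set[str]:
    """Return places that receive links but never produce any."""
    sources = {source for source, _ in pairs}
    targets = {target for _, target in pairs}
    return targets - sources


print(generativity(links).most_common())
print(termination_points(links))
```

With this sample data, DEVONthink is the only termination point. A real analysis would probably work per item and weight by time, which this sketch ignores.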

Another pattern I would like to see is the kinds of transformations that the data goes through in my system. On the small scale, I could see how often hooks are created between two content types, and in which direction. For example, I would be able to see how often a Day One journal entry turns into an OmniFocus project, or an Apple News article into an Obsidian doc.
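Counting transformations by direction could look something like the following sketch. It assumes each hook is recorded as an ordered `(from_type, to_type)` pair, which Hook does not currently do; the sample data and `direction_balance` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical hooks recorded as ordered (from_type, to_type) pairs.
hooks: List[Tuple[str, str]] = [
    ("Day One", "OmniFocus"),
    ("Day One", "OmniFocus"),
    ("OmniFocus", "Day One"),
    ("Apple News", "Obsidian"),
]

# Each ordered pair is its own key, so direction is preserved.
transitions = Counter(hooks)


def direction_balance(a: str, b: str, counts: Counter) -> Tuple[int, int]:
    """Return (a -> b count, b -> a count) for two content types."""
    return counts[(a, b)], counts[(b, a)]


forward, backward = direction_balance("Day One", "OmniFocus", transitions)
print(f"Day One -> OmniFocus: {forward}, OmniFocus -> Day One: {backward}")
```

Because `Counter` returns zero for missing keys, pairs that never occur in one direction need no special handling.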

I’d also love to have a sense of the patterns of processing across the whole lineage of data in a kind of pipeline view. Intuitively, I know that I have certain flows that I work through; it would be good to see which “cow paths” are emerging and how they line up with my expectations. Add some on-device machine learning to this, and I could get intelligent suggestions a la the topic “Can Hook eventually make intelligent suggestions?”. If I found a pattern that worked really well, I would love to be able to extract it into a kind of ‘Hook pipeline template’ that I could use to create repeatable workflows. This would be a kind of rules-driven way for Hook to more effectively guide me on some of these Cognitive Productivity paths.

From a social learning perspective, I would also love to see other community members’ patterns as described above (with their consent to sharing this, of course). If we each have this ability to see individually, we could share these in this forum. However, if we can choose to consent to our anonymized information being shared from the app, this might be useful for intelligent suggestions, too. This would be good for people to discover new kinds of links they haven’t considered before, or perhaps even apps that might be helpful that they don’t have yet. (And I’m sure the app developers wouldn’t mind, either.) I would assume that these would be settings you could customize if you didn’t want to see any of this either.

In order to do this, however, I would need to capture some sense of directionality with each Hook to explain the relationship between the items beyond the fact that they’re just linked. I’d need to have a sense that one piece of data generated the next. This causal relationship would ideally be captured at the moment of making the Hook.


Fascinating. I was a software team leader at SFU on Prof Phil Winne’s gStudy and nStudy projects for nearly a decade (2002 to 2009-12). (I still check in on that every now and then.) Phil was very interested in this kind of question. There we built a logging tool and a log analyzer. The apps aimed to provide comprehensive logging of user interactions (with the user’s consent, of course). Phil’s nStudy R&D projects are still active.

Hook creates a [~/Library/Application\ Support/com.cogsciapps.hook/log.csv](hook://file/J7O5AaGzC?p=QXBwbGljYXRpb24gU3VwcG9ydC9jb20uY29nc2NpYXBwcy5ob29r&n=log%2Ecsv) file in the user’s Library folder that could be used by an analysis tool. These data are not shared with CogSci Apps, of course; they stay in the user’s Library folder. The log does not explicitly capture the direction of links, as Hook does not have a model/DB layer notion of directed links. An analysis tool could also query Hook’s database.

The log.csv file is not officially documented or supported, meaning it’s not guaranteed to be correct or complete. However it can be used for various purposes. I for one sometimes look into it, both as a regular user of Hook and for testing purposes.
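For anyone who wants to peek at the file, here is a cautious sketch for a first look. Since the format is undocumented, it assumes nothing about column names or meaning and only reports row and field counts. UTF-8 encoding is an assumption, and `read_rows` and `summarize` are hypothetical helper names.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Dict, List

# Path taken from the post above; the file's schema is undocumented.
LOG_PATH = (
    Path.home() / "Library" / "Application Support"
    / "com.cogsciapps.hook" / "log.csv"
)


def read_rows(path: Path) -> List[List[str]]:
    """Read every CSV row as a list of strings, without assuming a header."""
    # UTF-8 is an assumption; undecodable bytes are replaced, not fatal.
    with path.open(newline="", encoding="utf-8", errors="replace") as handle:
        return list(csv.reader(handle))


def summarize(rows: List[List[str]]) -> Dict[str, object]:
    """Report the row count and how many rows have each field count."""
    return {
        "rows": len(rows),
        "field_counts": Counter(len(row) for row in rows),
    }


if LOG_PATH.exists():
    log_rows = read_rows(LOG_PATH)
    print(summarize(log_rows))
    for row in log_rows[:5]:  # eyeball the first few rows
        print(row)
else:
    print(f"No log found at {LOG_PATH}")
```

Looking at the field counts first is a cheap way to spot malformed or truncated rows before building anything on top of an unsupported file.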

Thanks for the pointers, both to Prof Winne’s projects and to the log.csv file. I’ll check out the content and see what I can report back on the support.

Expanding on the use cases from my earlier post, being able to literally follow our train of thought back has a lot of applications. While I shared the personal productivity ideas in my original post, there are more. As a consultant, I need to be able to trace the flow of ideas back to their ‘source’, so my clients can see and understand the rationale for themselves. I could see this being important in academic circles as well as journalism or finance, too.

Beyond attribution and analysis, this kind of tracking could be helpful for folks with ADHD and other neurodivergences to see how they arrived where they are, find their way back to their starting point, and then reclaim their focus. On occasion, I’ve used apps like Timing to go back in time and see what was happening to help me refocus, but it still doesn’t connect the content together. There are plenty of other tools to help support and maintain focus, too, but they each have leaks in some way. This is also giving me some motivation to return to my “auto-Hook” setup that I had running for a while, which would monitor the front-most window for content changes and would Hook items together if I lingered for longer than 5 or 10 seconds; it’s a “hook first, assess relevance later” kind of approach. :slight_smile:
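The dwell-time rule behind that kind of auto-Hook setup can be sketched roughly as follows. This is not the original setup: it does no window monitoring and assumes focus changes have already been captured as `(timestamp_seconds, item)` events. The function names, the event format, and the sample data are hypothetical.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (time the item came to the front, item name)


def lingered_items(events: List[Event], end_time: float,
                   threshold: float = 5.0) -> List[str]:
    """Keep items that stayed front-most for longer than the threshold."""
    kept: List[str] = []
    for index, (start, item) in enumerate(events):
        is_last = index + 1 == len(events)
        next_start = end_time if is_last else events[index + 1][0]
        dwelled = next_start - start
        # Skip brief visits and immediate repeats of the same item.
        if dwelled > threshold and (not kept or kept[-1] != item):
            kept.append(item)
    return kept


def auto_hooks(events: List[Event], end_time: float,
               threshold: float = 5.0) -> List[Tuple[str, str]]:
    """Pair each lingered item with the next one, in visiting order."""
    kept = lingered_items(events, end_time, threshold)
    return list(zip(kept, kept[1:]))


focus_events = [(0, "article"), (12, "mail"), (14, "task"), (30, "note")]
print(auto_hooks(focus_events, end_time=40))
```

Because the pairs keep visiting order, this approach also records a direction for each link as a side effect, although order of attention is only a proxy for one item generating the next.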

Apart from use cases, I’d also like to offer some perspective on the following statement you made about the current support for logs above:

> The log.csv file is not officially documented or supported, meaning it’s not guaranteed to be correct or complete. However it can be used for various purposes. I for one sometimes look into it, both as a regular user of Hook and for testing purposes.

I find that logs are first-class content and deserve the same focus on data quality, data integrity, and reliability as the rest of the apps they support. (For context, I have both hands-on and management experience with DevOps, developing enterprise data pipelines, and managing and improving data integrity for major companies and their flagship products.) This approach allows for more effective automation, monitoring, and analytical capabilities for the more technical users of systems (including but not limited to ops teams), which can both unburden the original developer (you, in this case) from additional feature development and inspire it. I appreciate that support and documentation for the log haven’t been a priority for you; however, I would like to offer my advocacy for that.

In the meantime, might I request that you add even a single free-form text field to a Hook between two links that we can use at our discretion to add a note? :slight_smile:


Thank you, @Rigorjunky, for sharing more about your usage and needs, and for the suggestions. Much appreciated. (I’m intrinsically interested in the subjects, and of course it’s necessary for Hook product road mapping.) I am also of the school of thought that log files are important. I will re-raise this discussion, which we have had in house, and I am raising the “annotate links” suggestion internally.


Thanks, @LucB. It’s always a pleasure interacting with you.
