Commit 4ca963f

Improve Obsidian-style script graph metadata
1 parent 367d007 commit 4ca963f

12 files changed

Lines changed: 550 additions & 39 deletions

src/PrompterOne.Core/AGENTS.md

Lines changed: 2 additions & 0 deletions
@@ -46,5 +46,7 @@ It owns TPS parsing, compilation, export, RSVP helpers, preview and workspace st
 - AI script graph labels, descriptions, semantic scopes, tokenizer chunks, and knowledge-graph markdown input must come from compiled TPS display text through the TPS SDK so they match the clean prompter text; raw TPS source is allowed only for source ranges and structural metadata, not post-hoc string-cleaned visible prose.
 - AI script graph semantic extraction must use an available LLM/chat-client path as the primary extractor. Regex, stop-word, capitalization, keyword, or hardcoded-domain semantic heuristics are strictly forbidden; when no LLM graph extraction is configured, the app may either show no semantic graph or run only an explicit user-requested tokenizer/vector similarity fallback based on `Microsoft.ML.Tokenizers` token vectors and distance calculations.
 - Script graph tokenizer/vector fallback must use the tokenizer/vector primitives exposed by `ManagedCode.MarkdownLd.Kb` when that package provides them; do not keep a duplicate Core-owned tokenizer implementation.
+- Script graph extraction must track the current `ManagedCode.MarkdownLd.Kb` graph model and use its metadata/front matter primitives such as entity hints, graph groups, related links, next steps, focused graph/search, and token topics before adding PrompterOne-owned graph glue.
+- Script graph output should feel like an Obsidian-style writer knowledge graph: documents, people, topics, backlinks, related ideas, and neighborhood metadata are primary; TPS mechanics remain source metadata unless they explain writing meaning.
 - AI and agent services must not use ad-hoc language heuristics for semantic meaning, intent matching, or action similarity. Prefer LLM extraction; the only non-LLM fallback allowed for similarity or search is explicit tokenizer/vector similarity based on `Microsoft.ML.Tokenizers`.
 - Respect root maintainability limits. Large parser/compiler edits need explicit decomposition or documented exceptions.

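The tokenizer/vector fallback named in the rules above can be pictured as a distance calculation over token-count vectors. The following is a minimal sketch under stated assumptions, not the PrompterOne or `ManagedCode.MarkdownLd.Kb` implementation: the class and method names are hypothetical, and the token ids are assumed to come from a real tokenizer (for example one from `Microsoft.ML.Tokenizers`) rather than being produced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of PrompterOne or ManagedCode.MarkdownLd.Kb.
public static class TokenVectorSimilaritySketch
{
    // Builds a sparse bag-of-tokens vector: token id -> occurrence count.
    // The token ids are assumed to be supplied by an external tokenizer.
    public static Dictionary<int, int> ToCountVector(IEnumerable<int> tokenIds)
    {
        var counts = new Dictionary<int, int>();
        foreach (var id in tokenIds)
        {
            counts[id] = counts.TryGetValue(id, out var current) ? current + 1 : 1;
        }

        return counts;
    }

    // Cosine similarity of two count vectors; returns 0 when either vector is empty.
    public static double Cosine(
        IReadOnlyDictionary<int, int> left,
        IReadOnlyDictionary<int, int> right)
    {
        if (left.Count == 0 || right.Count == 0)
        {
            return 0d;
        }

        var dot = 0d;
        foreach (var (id, count) in left)
        {
            if (right.TryGetValue(id, out var other))
            {
                dot += (double)count * other;
            }
        }

        var leftNorm = Math.Sqrt(left.Values.Sum(static v => (double)v * v));
        var rightNorm = Math.Sqrt(right.Values.Sum(static v => (double)v * v));
        return dot / (leftNorm * rightNorm);
    }
}
```

Two chunks with the same token sequence score 1, chunks with no shared tokens score 0, and everything else falls in between. Because the score is a plain distance calculation, it can be shown to the user as a lower-fidelity fallback without being mistaken for LLM semantic extraction.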
src/PrompterOne.Core/AI/Services/ScriptKnowledgeGraphCompiledDocument.cs

Lines changed: 153 additions & 0 deletions
@@ -10,12 +10,14 @@ private ScriptKnowledgeGraphCompiledDocument(
         TpsCompilationResult compilation,
         string displayText,
         string displayMarkdown,
+        string knowledgeMarkdown,
         IReadOnlyDictionary<string, string> segmentTextById,
         IReadOnlyDictionary<string, string> blockTextById)
     {
         Compilation = compilation;
         DisplayText = displayText;
         DisplayMarkdown = displayMarkdown;
+        KnowledgeMarkdown = knowledgeMarkdown;
         SegmentTextById = segmentTextById;
         BlockTextById = blockTextById;
     }
@@ -26,6 +28,8 @@ private ScriptKnowledgeGraphCompiledDocument(

     public string DisplayMarkdown { get; }

+    public string KnowledgeMarkdown { get; }
+
     public IReadOnlyDictionary<string, string> SegmentTextById { get; }

     public IReadOnlyDictionary<string, string> BlockTextById { get; }
@@ -45,11 +49,13 @@ public static ScriptKnowledgeGraphCompiledDocument Create(string sourceContent,
             StringComparer.Ordinal);
         var displayText = BuildWordText(compilation.Script.Words);
         var displayMarkdown = BuildDisplayMarkdown(compilation.Script, title);
+        var knowledgeMarkdown = BuildKnowledgeMarkdown(compilation.Script, displayMarkdown, title);

         return new ScriptKnowledgeGraphCompiledDocument(
             compilation,
             displayText,
             displayMarkdown,
+            knowledgeMarkdown,
             segmentTextById,
             blockTextById);
     }
@@ -97,11 +103,158 @@ private static void AppendScopeText(StringBuilder builder, string text)
         builder.AppendLine(text).AppendLine();
     }

+    private static string BuildKnowledgeMarkdown(CompiledScript script, string displayMarkdown, string? title)
+    {
+        var frontMatter = BuildKnowledgeFrontMatter(script, title);
+        return string.IsNullOrWhiteSpace(frontMatter)
+            ? displayMarkdown
+            : string.Concat(frontMatter, Environment.NewLine, displayMarkdown);
+    }
+
+    private static string BuildKnowledgeFrontMatter(CompiledScript script, string? title)
+    {
+        var metadata = CollectGraphMetadata(script, title);
+        if (metadata.EntityHints.Count == 0 && metadata.Tags.Count == 0 && metadata.Groups.Count == 0)
+        {
+            return string.Empty;
+        }
+
+        var builder = new StringBuilder();
+        builder.AppendLine("---");
+        if (!string.IsNullOrWhiteSpace(title))
+        {
+            builder.Append("title: ").AppendLine(QuoteYaml(title.Trim()));
+        }
+
+        if (metadata.Tags.Count > 0)
+        {
+            builder.AppendLine("tags:");
+            foreach (var tag in metadata.Tags)
+            {
+                builder.Append("  - ").AppendLine(QuoteYaml(tag));
+            }
+        }
+
+        if (metadata.Groups.Count > 0)
+        {
+            builder.AppendLine("graph_groups:");
+            foreach (var group in metadata.Groups)
+            {
+                builder.Append("  - ").AppendLine(QuoteYaml(group));
+            }
+        }
+
+        if (metadata.EntityHints.Count > 0)
+        {
+            builder.AppendLine("graph_entities:");
+            foreach (var hint in metadata.EntityHints)
+            {
+                builder.Append("  - label: ").AppendLine(QuoteYaml(hint.Label));
+                builder.Append("    type: ").AppendLine(QuoteYaml(hint.Type));
+            }
+        }
+
+        if (metadata.Related.Count > 0)
+        {
+            builder.AppendLine("graph_related:");
+            foreach (var related in metadata.Related)
+            {
+                builder.Append("  - ").AppendLine(QuoteYaml(related));
+            }
+        }
+
+        if (metadata.EntityHints.Count > 0)
+        {
+            builder.AppendLine("entity_hints:");
+            foreach (var hint in metadata.EntityHints)
+            {
+                builder.Append("  - label: ").AppendLine(QuoteYaml(hint.Label));
+                builder.Append("    type: ").AppendLine(QuoteYaml(hint.Type));
+            }
+        }
+
+        builder.AppendLine("---");
+        return builder.ToString().TrimEnd();
+    }
+
+    private static KnowledgeGraphFrontMatter CollectGraphMetadata(CompiledScript script, string? title)
+    {
+        var entityHints = new Dictionary<string, KnowledgeEntityHint>(StringComparer.OrdinalIgnoreCase);
+        var tags = new SortedSet<string>(StringComparer.OrdinalIgnoreCase);
+        var groups = new SortedSet<string>(StringComparer.OrdinalIgnoreCase)
+        {
+            "PrompterOne scripts"
+        };
+        var related = new SortedSet<string>(StringComparer.OrdinalIgnoreCase);
+        AddValue(tags, "tps");
+        AddValue(tags, "script");
+        AddValue(groups, title);
+
+        foreach (var segment in script.Segments)
+        {
+            AddEntityHint(entityHints, segment.Speaker, "schema:Person");
+            AddEntityHint(entityHints, segment.Emotion, "schema:DefinedTerm");
+            AddEntityHint(entityHints, segment.Archetype, "schema:DefinedTerm");
+            AddValue(tags, segment.Emotion);
+            AddValue(tags, segment.Archetype);
+            AddValue(related, segment.Name);
+
+            foreach (var block in segment.Blocks)
+            {
+                AddEntityHint(entityHints, block.Speaker, "schema:Person");
+                AddEntityHint(entityHints, block.Emotion, "schema:DefinedTerm");
+                AddEntityHint(entityHints, block.Archetype, "schema:DefinedTerm");
+                AddValue(tags, block.Emotion);
+                AddValue(tags, block.Archetype);
+                AddValue(related, block.Name);
+            }
+        }
+
+        return new KnowledgeGraphFrontMatter(
+            entityHints.Values
+                .OrderBy(static hint => hint.Type, StringComparer.OrdinalIgnoreCase)
+                .ThenBy(static hint => hint.Label, StringComparer.OrdinalIgnoreCase)
+                .ToArray(),
+            tags.ToArray(),
+            groups.ToArray(),
+            related.Take(12).ToArray());
+    }
+
+    private static void AddEntityHint(IDictionary<string, KnowledgeEntityHint> hints, string? label, string type)
+    {
+        if (string.IsNullOrWhiteSpace(label))
+        {
+            return;
+        }
+
+        var normalized = label.Trim();
+        hints.TryAdd($"{type}:{normalized}", new KnowledgeEntityHint(normalized, type));
+    }
+
+    private static void AddValue(ISet<string> values, string? value)
+    {
+        if (!string.IsNullOrWhiteSpace(value))
+        {
+            values.Add(value.Trim());
+        }
+    }
+
+    private static string QuoteYaml(string value) =>
+        "\"" + value.Replace("\\", "\\\\", StringComparison.Ordinal).Replace("\"", "\\\"", StringComparison.Ordinal) + "\"";
+
     private static string BuildWordText(IEnumerable<CompiledWord> words) =>
         string.Join(
             ' ',
             words
                 .Where(static word => word.Metadata.IsPause == false)
                 .Select(static word => word.CleanText)
                 .Where(static text => !string.IsNullOrWhiteSpace(text)));
+
+    private sealed record KnowledgeGraphFrontMatter(
+        IReadOnlyList<KnowledgeEntityHint> EntityHints,
+        IReadOnlyList<string> Tags,
+        IReadOnlyList<string> Groups,
+        IReadOnlyList<string> Related);
+
+    private sealed record KnowledgeEntityHint(string Label, string Type);
 }

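To make the shape of the generated front matter easier to picture, here is a reduced, self-contained sketch of the same idea: quote every scalar, emit simple lists for tags, and emit label/type pairs for entity hints. It is an illustration under stated assumptions rather than the committed code: `FrontMatterSketch`, its `Build` signature, and the sample values are hypothetical, and only the `title`, `tags`, and `entity_hints` keys from the diff are reproduced.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Illustrative sketch only; type names and sample values are hypothetical.
public static class FrontMatterSketch
{
    // Same escaping idea as QuoteYaml in the diff: backslashes first, then double quotes.
    public static string QuoteYaml(string value) =>
        "\"" + value.Replace("\\", "\\\\", StringComparison.Ordinal)
                    .Replace("\"", "\\\"", StringComparison.Ordinal) + "\"";

    public static string Build(
        string title,
        IEnumerable<string> tags,
        IEnumerable<(string Label, string Type)> entities)
    {
        var builder = new StringBuilder();
        builder.Append("---\n");
        builder.Append("title: ").Append(QuoteYaml(title)).Append('\n');

        builder.Append("tags:\n");
        foreach (var tag in tags)
        {
            builder.Append("  - ").Append(QuoteYaml(tag)).Append('\n');
        }

        builder.Append("entity_hints:\n");
        foreach (var (label, type) in entities)
        {
            // The "type" key is indented to line up with "label" so both
            // keys belong to the same list item.
            builder.Append("  - label: ").Append(QuoteYaml(label)).Append('\n');
            builder.Append("    type: ").Append(QuoteYaml(type)).Append('\n');
        }

        builder.Append("---");
        return builder.ToString();
    }
}

// Usage (hypothetical values):
//   FrontMatterSketch.Build(
//       "Launch \"Day\"",
//       new[] { "script", "tps" },
//       new[] { ("Alice", "schema:Person") });
//
// Returns:
//   ---
//   title: "Launch \"Day\""
//   tags:
//     - "script"
//     - "tps"
//   entity_hints:
//     - label: "Alice"
//       type: "schema:Person"
//   ---
```

Quoting every scalar keeps speaker names or segment titles that contain colons, hashes, or quotes from being misread as YAML syntax. The sketch leaves out the case-insensitive de-duplication and sorting that `CollectGraphMetadata` performs before the builder runs.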
src/PrompterOne.Core/AI/Services/ScriptKnowledgeGraphService.cs

Lines changed: 115 additions & 6 deletions
@@ -1,3 +1,4 @@
+using System.Globalization;
 using ManagedCode.MarkdownLd.Kb.Pipeline;
 using Microsoft.Extensions.Logging;
 using Microsoft.Extensions.Logging.Abstractions;
@@ -13,6 +14,7 @@ public sealed class ScriptKnowledgeGraphService(
 {
     private const string DocumentNodeId = "prompterone:document";
     private const string ContainsEdgeLabel = "contains";
+    private const string MarkdownKnowledgeSource = "markdown-ld-kb";
     private readonly IScriptKnowledgeGraphSemanticExtractor? _semanticExtractor = semanticExtractor;
     private readonly ScriptKnowledgeGraphTokenizerSimilarityExtractor _tokenizerSimilarityExtractor = tokenizerSimilarityExtractor ?? new();
     private readonly ILogger<ScriptKnowledgeGraphService> _logger = logger ?? NullLogger<ScriptKnowledgeGraphService>.Instance;
@@ -28,7 +30,7 @@ public async Task<ScriptKnowledgeGraphArtifact> BuildAsync(
         var pipeline = new MarkdownKnowledgePipeline();
         var kbResult = await pipeline
             .BuildFromMarkdownAsync(
-                compiledDocument.DisplayMarkdown,
+                compiledDocument.KnowledgeMarkdown,
                 CreateSourcePath(request.DocumentId),
                 cancellationToken: cancellationToken)
             .ConfigureAwait(false);
@@ -57,7 +59,7 @@ public async Task<ScriptKnowledgeGraphArtifact> BuildAsync(
             nodes,
             edges,
             ranges);
-        AddKnowledgeBankGraph(kbResult.Graph.ToSnapshot(), content, nodes, edges, ranges);
+        AddKnowledgeBankGraph(kbResult, content, nodes, edges, ranges);
         var semanticStatus = request.SemanticMode == ScriptKnowledgeGraphSemanticMode.StructuralOnly
             ? ScriptKnowledgeGraphSemanticStatus.StructuralOnly
             : await TryAddModelSemanticGraphAsync(
@@ -74,7 +76,7 @@ public async Task<ScriptKnowledgeGraphArtifact> BuildAsync(
             await _tokenizerSimilarityExtractor
                 .AddTokenizerSimilarityAsync(
                     content,
-                    compiledDocument.DisplayMarkdown,
+                    compiledDocument.KnowledgeMarkdown,
                     nodes,
                     edges,
                     ranges,
@@ -145,21 +147,37 @@ private async Task<ScriptKnowledgeGraphSemanticStatus> TryAddModelSemanticGraphA
     }

     private static void AddKnowledgeBankGraph(
-        KnowledgeGraphSnapshot snapshot,
+        MarkdownKnowledgeBuildResult result,
         string content,
         IDictionary<string, ScriptKnowledgeGraphNode> nodes,
         IDictionary<string, ScriptKnowledgeGraphEdge> edges,
         IDictionary<string, ScriptKnowledgeGraphSourceRange> ranges)
     {
+        var snapshot = result.Graph.ToSnapshot();
+        var factsById = result.Facts.Entities
+            .Where(static entity => !string.IsNullOrWhiteSpace(entity.Id))
+            .GroupBy(static entity => entity.Id!, StringComparer.Ordinal)
+            .ToDictionary(static group => group.Key, static group => group.First(), StringComparer.Ordinal);
         foreach (var node in snapshot.Nodes)
         {
             if (IsVisualKnowledgeNoise(node))
             {
                 continue;
             }

-            nodes.TryAdd(node.Id, new ScriptKnowledgeGraphNode(node.Id, node.Label, node.Kind.ToString(), "knowledge"));
-            ScriptKnowledgeGraphSourceRanges.AddRangeIfFound(content, node.Id, node.Label, ranges);
+            factsById.TryGetValue(node.Id, out var fact);
+            var kind = ResolveKnowledgeKind(node, fact);
+            var label = string.IsNullOrWhiteSpace(fact?.Label) ? node.Label : fact!.Label;
+            nodes.TryAdd(
+                node.Id,
+                new ScriptKnowledgeGraphNode(
+                    node.Id,
+                    label,
+                    kind,
+                    "knowledge",
+                    CreateKnowledgeDetail(node, fact),
+                    CreateKnowledgeAttributes(node, fact)));
+            ScriptKnowledgeGraphSourceRanges.AddRangeIfFound(content, node.Id, label, ranges);
         }

         foreach (var edge in snapshot.Edges)
@@ -174,6 +192,97 @@ private static void AddKnowledgeBankGraph(
         }
     }

+    private static string ResolveKnowledgeKind(KnowledgeGraphNode node, KnowledgeEntityFact? fact)
+    {
+        var type = fact?.Type ?? string.Empty;
+        if (ContainsType(type, "Person"))
+        {
+            return "Character";
+        }
+
+        if (ContainsType(type, "DefinedTerm"))
+        {
+            return "Term";
+        }
+
+        if (ContainsType(type, "Claim"))
+        {
+            return "Claim";
+        }
+
+        if (ContainsType(type, "CreativeWork") ||
+            ContainsType(type, "Article") ||
+            ContainsType(type, "TextDigitalDocument"))
+        {
+            return "Story";
+        }
+
+        return node.Kind switch
+        {
+            KnowledgeGraphNodeKind.Literal => "Literal",
+            KnowledgeGraphNodeKind.Blank => "Custom",
+            _ => "Entity",
+        };
+    }
+
+    private static bool ContainsType(string type, string value) =>
+        type.Contains(value, StringComparison.OrdinalIgnoreCase);
+
+    private static string? CreateKnowledgeDetail(KnowledgeGraphNode node, KnowledgeEntityFact? fact)
+    {
+        var detailParts = new List<string>();
+        AddDetail(detailParts, "type", fact?.Type);
+        AddDetail(detailParts, "source", fact?.Source);
+        if (node.Kind != KnowledgeGraphNodeKind.Uri)
+        {
+            AddDetail(detailParts, "rdf", node.Kind.ToString());
+        }
+
+        return detailParts.Count == 0 ? null : string.Join(" | ", detailParts);
+    }
+
+    private static IReadOnlyDictionary<string, string> CreateKnowledgeAttributes(
+        KnowledgeGraphNode node,
+        KnowledgeEntityFact? fact)
+    {
+        var attributes = new Dictionary<string, string>(StringComparer.Ordinal)
+        {
+            ["source"] = MarkdownKnowledgeSource,
+            ["rdfKind"] = node.Kind.ToString(),
+        };
+        AddAttribute(attributes, "entityType", fact?.Type);
+        AddAttribute(attributes, "sourceDocument", fact?.Source);
+        if (fact is not null)
+        {
+            AddAttribute(
+                attributes,
+                "confidence",
+                fact.Confidence.ToString("0.###", CultureInfo.InvariantCulture));
+            if (fact.SameAs.Count > 0)
+            {
+                AddAttribute(attributes, "sameAs", string.Join(", ", fact.SameAs));
+            }
+        }
+
+        return attributes;
+    }
+
+    private static void AddDetail(ICollection<string> details, string label, string? value)
+    {
+        if (!string.IsNullOrWhiteSpace(value))
+        {
+            details.Add($"{label}: {value.Trim()}");
+        }
+    }
+
+    private static void AddAttribute(IDictionary<string, string> attributes, string key, string? value)
+    {
+        if (!string.IsNullOrWhiteSpace(value))
+        {
+            attributes[key] = value.Trim();
+        }
+    }
+
     private static bool IsVisualKnowledgeNoise(KnowledgeGraphNode node) =>
         IsTpsHeaderLabel(node.Label) || IsSchemaUriNode(node);

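The type-to-kind mapping and the confidence formatting in this file can be tried in isolation. The sketch below is a hypothetical stand-in, not the service code: it takes the entity type as a plain string instead of `KnowledgeEntityFact`, drops the `node.Kind` fallback in favour of a fixed `"Entity"` default, and assumes the confidence value is a `double`, which the diff does not confirm.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical stand-in for the mapping shown in the diff; names are illustrative.
public static class KnowledgeKindSketch
{
    // Checks run in order, so the first matching substring decides the kind.
    public static string ResolveKind(string? entityType)
    {
        var type = entityType ?? string.Empty;
        if (Contains(type, "Person"))
        {
            return "Character";
        }

        if (Contains(type, "DefinedTerm"))
        {
            return "Term";
        }

        if (Contains(type, "Claim"))
        {
            return "Claim";
        }

        if (Contains(type, "CreativeWork") ||
            Contains(type, "Article") ||
            Contains(type, "TextDigitalDocument"))
        {
            return "Story";
        }

        // Simplification: the diff falls back to the RDF node kind here.
        return "Entity";
    }

    // Up to three decimals, culture-invariant, so "0.8" never becomes "0,8".
    public static string FormatConfidence(double confidence) =>
        confidence.ToString("0.###", CultureInfo.InvariantCulture);

    private static bool Contains(string type, string value) =>
        type.Contains(value, StringComparison.OrdinalIgnoreCase);
}

// Usage:
//   KnowledgeKindSketch.ResolveKind("schema:Person")              // "Character"
//   KnowledgeKindSketch.ResolveKind("https://schema.org/Article") // "Story"
//   KnowledgeKindSketch.ResolveKind(null)                         // "Entity"
//   KnowledgeKindSketch.FormatConfidence(0.8)                     // "0.8"
//   KnowledgeKindSketch.FormatConfidence(1)                       // "1"
```

Because the match is a case-insensitive substring test, any type string that merely contains `Person` (for example a hypothetical `schema:PersonalNote`) would also resolve to `Character`. If that looseness matters, comparing the local name after the last `:` or `/` would be a stricter alternative.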
src/PrompterOne.Shared/AGENTS.md

Lines changed: 2 additions & 0 deletions
@@ -44,6 +44,8 @@
 - Script graph UI must expose an intentional analysis path when richer extraction is needed, and every graph mode should answer a concrete writer question such as "what is this about", "how are blocks connected", "which terms recur", "which references matter", or "where should I edit this idea".
 - Script graph UI must not silently fall back to regex/keyword semantic extraction when LLM graph extraction is unavailable; regex, stop-word, capitalization, keyword, and hardcoded-domain semantic heuristics are strictly forbidden. In that state, show an explicit action to connect AI or run tokenizer/vector similarity analysis; the tokenizer path must be user-visible as a lower-fidelity fallback, not disguised as LLM semantic understanding.
 - Script graph UI should surface tokenizer/vector similarity through the `ManagedCode.MarkdownLd.Kb` tokenizer path when available; do not add a separate browser or Shared-owned tokenizer implementation for the same fallback.
+- Script graph UI should read like an Obsidian graph workspace: expose document metadata, tags/topics, people, backlinks/links, node neighborhoods, and source-jump affordances around the canvas instead of showing only a decorative graph renderer.
+- Script graph UI must stay aligned with the current `ManagedCode.MarkdownLd.Kb` model; when that package adds graph metadata, focused search, entity hints, graph groups, related links, next steps, or token-topic primitives, PrompterOne should surface those concepts rather than flattening them into generic nodes.
 - Assistant spotlight suggestions and script graph UI must not use ad-hoc language heuristics for intent or similarity. Use LLM-backed semantics when available, or explicit tokenizer/vector similarity over localized action descriptions and the user query when a non-LLM fallback is needed.
 - Script graph editing must support a split source/graph workflow: the user should be able to inspect the graph beside the editor, click a meaningful graph node, and have the owning source range revealed or highlighted without leaving the graph workspace.
 - Script graph split mode must expose a user-resizable divider between source and graph panes, and graph view must also support a graph-only workspace mode for focused exploration.
