>>10155 >Concern with ipwb: it adds a separate pin for every file in the WARC. Workaround: compare the pinset before and after indexing with ipwb, then regroup the new ipwb pins under a single pin instead of many. This is what I did; it would be easier for me and other users if ipwb had this as a built-in feature (!!!!):
$ ipfs pin ls --type=recursive > p1.txt
$ ipwb index maremaremaremaremaremare.com-2023-07-11-999f817e-00000.warc.gz >> f1.cdxj
$ ipfs pin ls --type=recursive > p2.txt
$ cat p1.txt p2.txt > p3.txt; vim p3.txt
$ # in vim, remove the CIDs that appear in both lists so only the new pins remain: run :sort, then :%s/^\(.*\)\n\1$\n//g
$ ipfs files mkdir -p /web/warc/ipwb/data0000; ipfs files mkdir -p /cids/pins_in_mfs
$ sed "s/ .*//" p3.txt | h=$(wc -l < p3.txt) xargs -d "\n" sh -c 'for args do grep -n -- "$args" p3.txt | sed "s/:.*//" | tr -d \\n; printf "/%s\n" "$h"; ipfs files cp "/ipfs/$args" "/web/warc/ipwb/data0000/$args"; ipfs pin rm "$args"; done' _
$ ipfs files ls --long /web/warc/ipwb # copy the CID for data0000, which is:
$ ipfs pin add --progress QmbrMh1Ku99kqPjuD7R8Q1QrsRyX44MM4k2Jt4m62YVAp1; ipfs add f1.cdxj
[...]
$ ipfs pin add /ipfs/QmVaQycySqU6ejZfPgQ7HmzV2kCaiT8XMmLpFxFkCPYVs7
[...]
$ h=QmbrMh1Ku99kqPjuD7R8Q1QrsRyX44MM4k2Jt4m62YVAp1; ipfs files cp /ipfs/$h /cids/pins_in_mfs/$h
$ h=QmVaQycySqU6ejZfPgQ7HmzV2kCaiT8XMmLpFxFkCPYVs7; ipfs files cp /ipfs/$h /cids/pins_in_mfs/$h
$ ipfs files cp /ipfs/$h /web/warc/ipwb/f1.cdxj # $h is still the CID set on the previous line
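The cat + vim step can also be done without an editor. Rough sketch only, assuming p1.txt and p2.txt are `ipfs pin ls --type=recursive` output as above; `new_pins` is a made-up helper name, not something ipfs or ipwb ships. One difference: it lists only pins that are new in the second file, while the vim substitution also leaves pins that were only in the first file.

```shell
#!/bin/sh
# new_pins BEFORE AFTER  (hypothetical helper, not part of ipfs or ipwb)
# Prints the CIDs that are in AFTER but not in BEFORE, one per line.
# Both files are expected to hold lines of the form "<cid> recursive".
new_pins() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # keep only the CID column and sort it, since comm needs sorted input
    cut -d ' ' -f 1 "$1" | sort -u > "$before_sorted"
    cut -d ' ' -f 1 "$2" | sort -u > "$after_sorted"
    # -1 drops lines only in BEFORE, -3 drops lines common to both,
    # so what remains is the set of lines only in AFTER
    comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Usage in place of the cat + vim step:
#   new_pins p1.txt p2.txt > p3.txt
```

The result is bare CIDs with no " recursive" suffix, so the sed that strips the suffix in the loop has nothing left to remove.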