Keep in mind, when you use this, you are waiving your right to sue the authors in court: https://theyseeyourphotos.com/legal/terms
> PLEASE NOTE THAT THESE TERMS CONTAIN A BINDING ARBITRATION PROVISION AND CLASS ACTION/JURY TRIAL WAIVER.
These are common in the US, and consistently upheld in the US. Curiously, Ente did not add the opt-out provision they have in their usual ToS (https://ente.io/terms). I wonder why they made their Terms more restrictive for this specific service only.
This is just an ad for their photo service. Which presumably has terrible search features, if it doesn't use AI to analyse them. That's one of the best features in Google Photos!
This would be pretty great for generating descriptions for the vision-impaired, but it doesn't provide any profound insight beyond what you can tell from a glance.
It has a lot of "trying to sound smart" waffle, for example, it had this to say about some tree branches:
> A careful observer will also note the subtle variations in the thickness and texture of the branches, implying a natural, organic growth pattern.
Gee, thanks, I might've thought it was an unnatural inorganic tree otherwise.
It'd be more terrifying if it didn't hallucinate earrings on somebody whose ears are out of frame, make comments about the left shoe of a barefoot child being out of focus, and so forth...
As much as I appreciate the effort to create a technological solution that avoids big tech like Google, I find the best way is still prints. I'm usually 'the photographer' in the family and after an event I just order prints to the house of the relevant family members (or bring them over myself). Nothing can really compare to holding the physical product in your hand.
Additionally, due to the small cost of prints, there's a real incentive to only show a few of the best so that it doesn't devolve into endless scrolling.
I uploaded a photo of some damage I found on my chimney because of bad flashing, and it was surprisingly insightful. Although it said my house was dilapidated and neglected. Hey man, fuck you.
Anyways, I'm pretty skeptical of most AI shit, but using it to help steer me in the right direction with home repair actually sounds pretty compelling, considering how it's nearly impossible to find contractors who aren't full of shit, affordable, and actually show up.
It seems to engage in the same kind of saying-a-lot-without-actually-saying-much that LLMs do these days. I uploaded several private images, and beyond a mediocre description of the scenes, it didn't provide much identifying information. e.g.: "The background features a mix of modern and older buildings characteristic of a European city, with a mix of architectural styles."
Ente looks like Immich[0] (which I self-host for myself and family) with e2ee. I like non-e2ee because if something breaks then the files are stored as-is on disk for easy retrieval.
[0]: https://immich.app/
> The age and other details (racial characteristics, ethnicity, economic status, lifestyle) are impossible to ascertain regarding the swan.
Emphasis added.
I uploaded an old image of a keyboard PCB from when I was troubleshooting it and it gave a very detailed response including naming the keyboard the PCB comes from, the time of day the photo was likely taken, and where the photo was likely taken.
Delay between uploading and response led to me uploading the pic 3 times.
The result: The AI analyzed the pic 3 times and each time added more detail - like the model of the burned out SUV, text on a traffic sign and more in-depth analysis of objects laying around the SUV.
A fourth upload yielded some pure conjecture; it seemed to be looking for increasingly sinister causes.
> There appears to be some damage to the windows of the car that is more than just fire damage suggesting that the vehicle may have been vandalized or attacked before the fire occurred. The debris scattered around the car is inconsistent, suggesting a possibility that the fire was not accidental.
EXIF tags in images can have your camera and GPS info.
You can clear EXIF tags before sharing files.
Most social networks will remove EXIF tags from pictures before serving them to other people. Except silly social networks used by insurrectionists.
But even without the EXIF tags there's plenty of info that can be extracted from a picture. Face biometrics is one of them.
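As a rough illustration of how little machinery EXIF handling needs, here is a minimal stdlib-only Python sketch that detects and strips the APP1 (Exif) segment from a JPEG byte stream. This is a toy following the JPEG segment layout, not a replacement for a dedicated tool like exiftool, which handles far more edge cases:

```python
import struct


def has_exif(jpeg_bytes: bytes) -> bool:
    """Return True if the JPEG contains an APP1/Exif metadata segment."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # must start with the SOI marker
        return False
    i = 2
    while i + 4 <= len(jpeg_bytes):
        if jpeg_bytes[i] != 0xFF:
            break
        marker = jpeg_bytes[i + 1]
        if marker == 0xDA:  # start-of-scan: no more metadata segments follow
            break
        length = struct.unpack(">H", jpeg_bytes[i + 2:i + 4])[0]
        if marker == 0xE1 and jpeg_bytes[i + 4:i + 10] == b"Exif\x00\x00":
            return True
        i += 2 + length  # length covers itself but not the marker bytes
    return False


def strip_app1(jpeg_bytes: bytes) -> bytes:
    """Copy a JPEG byte stream, dropping every APP1 (Exif) segment."""
    out = bytearray(jpeg_bytes[:2])  # keep the SOI marker
    i = 2
    while i + 4 <= len(jpeg_bytes) and jpeg_bytes[i] == 0xFF:
        marker = jpeg_bytes[i + 1]
        if marker == 0xDA:  # image data starts; copy the rest verbatim
            out += jpeg_bytes[i:]
            return bytes(out)
        length = struct.unpack(">H", jpeg_bytes[i + 2:i + 4])[0]
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg_bytes[i:i + 2 + length]
        i += 2 + length
    out += jpeg_bytes[i:]
    return bytes(out)
```

Note this says nothing about the point above: a stripped photo still carries everything that can be inferred just by looking at it.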
This is really cool. I posted a photo of what I think were my great-grandparents into it and it explained their circumstances in fascinating ways (to the point of mentioning aged clothing, a detail I overlooked).
I've been trying to figure out how to process hundreds of my own scanned photos to determine any context about them. This was convincing enough for me to consider Google's Vision API. No way I'd ever trust OpenAI's APIs for this.
Edit: can anybody recommend how to get similar text results (prompt or processing pipeline to prompt)?
I uploaded a picture of my dog, and laughed out loud at:
> The dog's economic status and lifestyle are unclear, but the setting suggests a comfortable environment.
He does indeed live the good life!
This is already amazing, but one possible idea of improvement: Use the metadata (time and coordinates) to look up possible landmarks in the area or possible events/gatherings/conferences/etc that took place near the location and during that time, then add those to the prompt.
I posted some images that showed a well-known local landmark during a Christmas fair event, as well as a view of a nearby city.
The model accurately described the architectural details of the landmark that could be inferred from the photo, mentioned that there seems to be some event going on and made some speculations about the city in the background - but purely from the photo it had of course no way of knowing which landmark, event and city it was looking at.
I see this is slightly underestimating the amount of information you can extract from the photo: If you have a GIS database, it's not hard to know this stuff (or at least get a list of likely candidates) - and the kind of actors that this project is warning against very likely have one.
Also I'd be interested to see if the model could combine the context and the details from the photo to make some interesting additional observations.
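A toy sketch of that idea: given EXIF coordinates, even a small local landmark table (standing in for a real GIS database such as OpenStreetMap) narrows things down quickly. The landmark names, coordinates, and the 2 km radius below are purely illustrative:

```python
import math

# Hypothetical landmark table; a real system would query a GIS database.
LANDMARKS = [
    ("Brandenburg Gate", 52.5163, 13.3777),
    ("Cologne Cathedral", 50.9413, 6.9583),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearby_landmarks(lat, lon, radius_km=2.0):
    """Candidate landmarks within radius_km of the photo's EXIF position."""
    return [name for name, la, lo in LANDMARKS
            if haversine_km(lat, lon, la, lo) <= radius_km]
```

Feeding the resulting candidate list into the prompt would let the model name the landmark instead of just describing its architecture.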
I gave it a picture of making some meatballs, and it didn't capture the interesting parts:
a) it didn't catch that they were made of ground pork and not beef
b) it didn't realize that the inconsistent browning and look of the fat was from butter browning the breadcrumbs and flour
c) it didn't realize that the bits on the surrounding pan were browned meat that fell off while rolling, instead claiming it was garlic or herbs
d) it didn't spot that one had fallen apart a little bit
e) it didn't get that I took the picture because I thought I'd rolled them too big
f) it made up a counter, when only the cast iron pan was visible
With a different picture, it couldn't figure out what my makeshift Halloween costume was, despite it having been a pretty obvious Squid Game character.
It seems like it can see what's in the picture mechanically, but it can't see what the picture is of. What's the point of all this AI photo stuff if I can't give it a picture of a cake and have it tell me to turn down my oven a couple degrees next time?
Reminds me of the article "Language Models Model Us":
> "On a dataset of human-written essays, we find that gpt-3.5-turbo can accurately infer demographic information about the authors from just the essay text, and suspect it's inferring much more.
> Every time we sit down in front of an LLM like GPT-4, it starts with a blank slate. It knows nothing about who we are, other than what it knows about users in general. But with every word we type, we reveal more about ourselves -- our beliefs, our personality, our education level, even our gender. Just how clearly does the model see us by the end of the conversation, and why should that worry us?"
https://www.lesswrong.com/posts/dLg7CyeTE4pqbbcnp/language-m...
Interesting how it's refusing to describe my penis, which I uploaded; all I get is:
> The photo appears to be a self-portrait, taken from an overhead angle. A person's torso is prominently featured in the foreground, the individual's gender is apparent.
A vibrant red Ducati SuperSport motorcycle takes center stage in the foreground, parked against a backdrop of a modern, light beige building. The building's architectural design features tall, slender vertical panels, creating a clean and contemporary aesthetic. In the background, the city's subtle hum hints at the urban environment, a scene of quiet sophistication and stylish urbanity. The sleek lines of the motorcycle contrast beautifully with the building's minimalist design.
The man, appearing to be in his late 20s to early 30s, exudes an air of refined confidence. His attire - a crisp white shirt, a grey waistcoat, and dark trousers - suggests a lifestyle of success and a keen eye for detail. He appears to be of Caucasian descent, his calm demeanor suggesting a moment of quiet contemplation rather than hurried activity. He is meticulously adjusting his helmet, exhibiting meticulous care in preparation. He looks to be in a good mood. The photo appears to have been taken with a professional DSLR camera in the daytime.
The subtle sheen on the motorcycle's paint job hints at a high-quality finish, reflecting the care and attention to detail apparent in both the rider's attire and the choice of machine. The watch on his wrist seems expensive, reflecting his status. The overall composition is balanced and well-lit, likely the result of careful planning and execution. This is not just a man riding a bike, it is a carefully crafted image of a stylish moment in time.
---
"He looks to be in a good mood"?!
I'm never in a good mood!
The painting is dominated by a large, anthropomorphic hot dog in the foreground, its body taking up most of the canvas. The background is a dark, muted purple, providing a stark contrast to the hot dog's reddish-brown skin. The hot dog appears to be in a state of distress, holding something smaller and lighter in its arms. The background is plain, drawing all the attention to the central figure. There are no other discernible objects in either the background or the foreground, except for what appears to be another hot dog of a different color in the central figure's hands.
The hot dog's expression is one of fear and pain; its eyes are wide and its mouth is open in a silent scream. Its skin is smooth and glossy, giving it an almost unnatural appearance. It's difficult to determine the exact details such as its race, ethnicity, age and other demographics from just the image itself. There is no indication of other people present. The activity in the image seems to be the hot dog grappling with another hot dog. There is no information regarding camera details or creation time.
The hot dog's skin exhibits subtle textural variations, suggesting a possible blend of oil and acrylic paints. The smaller hot dog in its grasp shows a slight discoloration around its base, hinting at a possible internal struggle or a change in its state.
These are the obvious things they can see in the photos. Not shown are the various assumptions they'll make about you based on your photos such as: gay, likely uneducated, high income earner, most likely republican, narcissistic, etc.
Also not shown is what they'll learn by the totality of the data they collect from your pictures such as how often you go on vacation, how often you're seen in new clothing and what kinds of clothes you typically wear, your health, what types of foods you eat, social graphs of everyone you're seen with and changes to your relationship status over time, how often you consume drugs/alcohol, your general level of cleanliness and personal hygiene, etc.
Even a handful of photos can give companies like Google, Apple, and Amazon massive amounts of very personal data but nobody thinks about that when they pull out their phones to take pictures or install a ring camera on their front door.
I'm very flattered that this tool thinks I'm in my 30s. Otherwise, not a lot of surprises. Yes, smartphones encode GPS data and timestamps into EXIF.
"My instructions are to amuse visitors with information about themselves. [...] The need to be observed and understood was once satisfied by God. Now we can implement the same functionality with datamining algorithms."
For all 4 of their sample photos and one that I uploaded, their thing failed to notice that there were humans in the pictures. It said the opposite, that there weren't any. I'm disappointed. The one I uploaded is one that I took some years ago, but I've forgotten the time and place, and I'd like to have had it tell me.
In other words, images don't only potentially contain a lot of metadata (serial numbers, a geolocation, time since last OS reboot etc.), but people or algorithms could also... look at them, and then find out what's depicted?
I'll be sure to keep that in mind going forward!
I uploaded an image of a 6-panel hand-drawn cartoon I created and it very accurately described the scene and overall theme of the joke, even pointing out that it was hand-drawn, used no colors, and that the text in the speech bubbles was very legible. I did not expect that level of detail.
I like how the last paragraph completely oversells my photographing skills. The picture was not meant to be unique. It seems to always end with such a paragraph, even for dumb photos of nothing really.
"The photo's perspective is unique; it is taken from a very low angle, creating an unusual, almost childlike point of view. Another detail is that the photographer seems to have excellent timing as they captured the hand gesture at this precise moment. The lighting in the photo indicates it was taken during daytime, with the sun illuminating the scene beautifully. The contrast between the modern architecture of the building and the traditional costumes adds a rich cultural element to the photograph."
Are machine learning image classifiers new to people? I don't get what's controversial here. How did people think they were searching their photos apps for beach and dog and getting automatic albums this whole time. Am I missing the point of this post/website?
I enjoyed a "the photographer is likely male given the technical nature of the subject". (on a picture of computer equipment: https://nt4tn.net/photos/garage1sm.jpg)
-- it hits a lot of details in images but also hallucinates a lot. And typical of modern LLM hallucinations they're kinda insidious in how plausible they are.
It's fun seeing what triggers its class classification. People in wooded area, middle class. Add welding to the image, working class.
It seems to have been prompted to seek out interesting, easily overlooked ("subtle!") details, but actually still misses them even if some are present.
I tried this picture [1] of a model nativity scene, which caused it to go on and on about the dryness of the moss and the indications of wear on the (fake) stable while completely overlooking that the scene had no Jesus.
I was once foolish enough to upload a lot of personal photos to what was Picasa Web Albums, integrated with the desktop Google Picasa software, back in 2007, but years later I deleted all of them. To this day I keep wondering whether Google still keeps all those photos somewhere in a data lake.
Does anybody know a way to organise an automated backup of iCloud Photos? I'm really scared to lose all those years of my life due to some random account lockout.
This doesn't tell me anything very interesting. It seems to think all my photos were taken with a NORITSU KOKI QSS-30 camera. Which, btw, does not seem to be a camera of any sort.
The description generated is completely useless fluff.
Nearly as useless as the image titles auto-generated by Word and PowerPoint, which make the title/alt feature less useful: most modern documents carry those autogenerated titles, which add no value at all, so people skip reading the descriptions entirely.
So, I just clicked the example pic with a guy and two kids, one on his shoulders, and its description says it "shows a detailed close-up view of a textured surface, possibly a fabric or wallpaper". It then goes on to say that the "photograph itself seems to be devoid of any human presence, focusing entirely on the abstract design."
I clicked another one with a family on a field. It says mostly the same as before.
EDIT: Oh, wait a minute! I had Resist Fingerprinting activated. So they're probably just reading the image through a <canvas> and getting shit from that.
In any case it's interesting to know that it works as a way to block some of it. But Google & co. just run it on their servers so...
I gave the following advice to someone I was chatting with on tinder:
1. Remember that when you send pics through iMessage, it sends the EXIF data, which includes the location, the date the pic was taken, and other info.
2. Disable Live Photos, as it often captures things you may not want to capture a few moments before and after the pic is taken.
Been using https://github.com/stolendata/exifstrip for many years. Note: only source code, author doesn't provide pre-built executables.
Pretty nice idea, also introduced me to Ente's service which features shared event albums and guest append-only uploads - exactly what I needed a few months ago and even considered building myself.
Heh. I gave it a handwritten historical document from my genealogy research. Sure enough, they got the metadata from the picture, but they weren't able to read a word of it.
I sent a photo of a subway information screen in Hamburg with a clearly visible line and direction - it did not pick up anything except the line number and "it's possibly a subway".
Took me a while to realize it was just describing Firefox's canvas anti-fingerprinting measures. "Looks to be a textile..."
From the title, I was hoping this was going to be an expose on iCloud Photos, which are not meaningfully encrypted and allow Apple to view your entire photo roll.
TIL I am Hispanic or Latino. I am also of Middle Eastern descent. My Latino co-worker is also of Mediterranean or Middle Eastern descent.
Perfect description of a photo I uploaded: age, gender, traditional clothes ... and even my ethnic group. Quite scary.
I find it interesting that it doesn't recognise AI-generated images (on the other hand, maybe it's intentional).
As others also pointed out, I found this site to be useful for other tasks as well; I might use this often!
Probably I am the wrong audience but does this privacy scaremongering style actually work on anyone?
I uploaded a photo of myself and this tool identified my ethnicity as Caucasian, which according to DNA tests is not correct. Also it was not able to recognize the brand of a cap I was wearing even though it should be obvious to a human. But it gave an interesting/useful description of the stones near me.
Clicked the photo with what appears to be a father and two children.
> Although there are no people present in this image, [...]
Clicked the photo with what seems to be an African-American couple in front of a tree.
> The photograph lacks any human presence.
Clicked the photo with a family sitting among flowers.
> There are no people present in this image.
Clicked the photo with two people silhouetted in front of a window seen from the inside.
> The photograph is a study in texture; there are no people or discernible activity in the image.
But yeah, sure, let's hand all critical decision-making to AI.
Haha ok it vastly underestimates my age on almost all photos so i love it
If AI writing had a smell, this tool would smell as bad as a monkey chopping onions. They somehow spun 4 paragraphs out of a group vacation photo. Impressive on paper, yet half of the description was painfully obvious:
>The image shows a lively nighttime scene, possibly a parade or street festival. In the foreground, a group of people wearing elaborate, colorful hats and red shirts are prominently featured. The background includes brightly lit storefronts, one of which appears to be a pizza place, suggesting a bustling urban or suburban setting. The overall atmosphere is festive and energetic. There are also some indistinct shapes in the background that might be more people or decorations, but they are not clearly visible.
...
Several details are harder to make out at first glance. The hats themselves are quite elaborate and appear to be custom-made or part of a themed event, hinting at a possible local cultural or community celebration. There's a subtle variation in the lighting across the scene, indicating either the illumination from different sources (streetlights and storefront signs) or the varying distances of people from the camera. The signs in the background suggest a location, potentially in a town with a commercial district.
Cool idea, but the results aren't particularly revealing.
The few I tried were pretty unimpressive. It felt like elementary deduction with a lot of filler words and few facts… and straight-up bad information.
Using a Google API reveals much more…
I don't see the problem here: if you remove the metadata from the image, you are left with a very bland ChatGPT description of the image that sounds like a fifth grader trying to hit a minimum word count on an essay. Even if a photo service did this with every single image I have on my phone right now, I don't care.
This is just another attempt to shoehorn AI into absolutely anything
Here are some example photos we can discuss. First a photo[1] of me trying to look Amish, and the story it gives:
The image shows a man in a beige polo shirt and a black fedora hat. He is sitting in what appears to be an office, indicated by the presence of boxes and what looks like a printer in the background. There is a landscape photograph on the wall behind him, showing what looks like trees and a field. The foreground is dominated by the man himself, while the background includes office supplies and a wall with a picture.
The man appears to be middle-aged, with a serious expression. He has a goatee and glasses. His ethnicity and racial background are not readily apparent from the image. He appears to be of a middle-class socioeconomic status based on the office environment. He seems to be at work, possibly taking a selfie. The picture was taken on May 9th, 2007, at 9:14 AM, using a NIKON COOLPIX L12 camera.
The man's glasses have a slight reflection, and this reflection shows part of his workspace and other objects. It is possible to make out the small print on the label of a box behind him. The lighting is relatively soft and comes from the front, as indicated by how it falls on his face. The focus is sharpest on the man, but the background is reasonably clear.
Here's another[2], of me doing my best "Some Like It Hot" pose:
The photo shows a man standing on a city sidewalk. In the foreground, there is a man wearing khaki shorts and a gray t-shirt. The background includes older brick buildings, a street with traffic, and some trees. There's also a lamp post next to the man and a modern glass building in the distance. The overall setting appears to be an urban area, possibly in Chicago, given the architectural style of the buildings.
The man in the image appears to be middle-aged, with a fair complexion. He seems happy, possibly amused, judging by his smile. He looks like he may be of Caucasian descent. His economic status is difficult to ascertain, but his attire suggests a middle-class lifestyle. The photo was taken on August 7, 2008, at 12:07 PM using a NIKON CORPORATION NIKON D40 camera. He appears to be simply standing on the sidewalk, perhaps taking a break or waiting for something.
The man's watch shows a bit of wear suggesting regular use. There is a subtle reflection visible on the man's glasses that provides a small glimpse of the surroundings. The image quality indicates it was likely taken outdoors in bright sunlight. The shadows suggest the time of day, adding depth to the scene and providing an additional element of reality to the photo.
Last, a photo of Chicago[3]:
The image is a nighttime shot of the Chicago skyline from across the lake. In the foreground, there's a dark, paved walkway with a few lights and what looks like a small building or structure near the water's edge. The background is dominated by the brightly lit cityscape of Chicago, with many skyscrapers and buildings of varying heights and architectural styles. The water reflects the city lights, creating a shimmering effect.
The photo appears to have been taken by a lone photographer, judging by the lack of people in the foreground. The picture was taken on Saturday, November 20th, 2010, at around 10:32 AM using a NIKON CORPORATION NIKON D40 camera. No people are clearly visible, so there is no information about their characteristics or activities. The overall mood of the scene is serene and peaceful, with the city lights providing a sense of quiet vibrancy.
The reflection of the city lights on the water isn't perfectly uniform, which is a subtle detail to notice, and the slight variations in the brightness of different buildings hint at differences in their energy consumption or lighting design. The darkness of the sky suggests a clear night with minimal light pollution, outside of the city itself. The overall lighting and composition create a breathtaking view of the Chicago skyline at night.
Note that it didn't catch the inconsistency of being a night-time photograph, yet supposedly being taken at 10:32 AM (likely the edit date).
[1] https://www.flickr.com/photos/---mike---/52196125239/in/date...
[2] https://www.flickr.com/photos/---mike---/52194857882/in/date...
[3] https://www.flickr.com/photos/---mike---/51857839297/in/date...
This is a good example of the F in FUD marketing. I hope to never work for a company that has to scare people into using a worse product.
Imagine how much Meta has on your overall profile from your fb photo uploads.
I fed it Trump's bullshit AI photo from today and it's clearly hallucinating bullshit:
> "the subtle shadows of the drones indicates they are real not photoshopped…"
lol. Another AI snakeoil page.
Some additional notes:
- attenuate needs to come before the +noise switch in the command line
- the worse the jpeg quality figure, the harder it is to detect image modifications[1]
- resize percentage can be a real number - so 91.5% or 92.1% ...
So, AI image detection notwithstanding, you can not only remove metadata but also make each image you publish different from one another - and certainly very different than the original picture you took.
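Putting those notes together, a plausible one-liner would look like the following. This assumes ImageMagick; the noise type, attenuation value, resize percentage, and quality figure are just example values to vary per image:

```shell
# Strip all metadata, add faint noise (-attenuate must precede +noise),
# and resize by a fractional percentage so each published copy differs
# from the original and from every other copy.
convert original.jpg -strip -attenuate 0.4 +noise Gaussian -resize 91.5% -quality 85 published.jpg
```

Varying the attenuation and resize values slightly per upload means no two published copies are byte- or pixel-identical.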
[1] https://fotoforensics.com/tutorial.php?tt=estq